Preface 


The first time I cracked open a text on ordinary differential equations (ODE), it was 
instant love. I don’t remember the exact title of the book, or the author’s name, 
but I do remember that the book was thinner than the ones I have seen selected 
for use in introductory courses in recent times. It was a book a student could read 
from cover to cover, while taking a course in the subject. I began to write course 
notes, with the aim of producing a text more to my liking. After a few years of 
this, the current book emerged. 


This book has four chapters. I use Chapter 1 and parts of Chapters 2 and 3 for 
a first semester introduction to differential equations, and I use the rest of Chapters 
2 and 3 together with Chapter 4 for the second semester. 


Chapter 1 deals with single differential equations, first equations of order 1, 


(0.0.1)   $\frac{dx}{dt} = f(t,x),$

then equations of order 2, 

(0.0.2)   $\frac{d^2x}{dt^2} = f(t,x,x').$


We have a brief discussion of higher order equations. For second order equations, 
we concentrate on the case 

(0.0.3)   $\frac{d^2x}{dt^2} = f(x,x'),$

which can be reduced to a first order equation for $v = x'$, as a function of $x$. 
Newton's law $F = ma$ for motion of a particle on a line gives such equations. We 
also specialize (0.0.2) to the linear case, 

(0.0.4)   $x'' + bx' + cx = f(t),$


and discuss techniques for solving such equations. 


While the study of single equations is the place to start, the subject of differential 
equations is and always has been mainly about systems of equations. This 
study requires a healthy dose of linear algebra. For a number of good reasons, it 
is not desirable to require a course in linear algebra as a prerequisite (or even a 
corequisite) for a course in differential equations, but rather the course includes 
some basic instruction in linear algebra. Chapter 2 provides the needed minicourse 
in linear algebra. We differ from most introductions to differential equations in 
providing complete proofs of the relevant results, including material on determi- 
nants, eigenvalues and eigenvectors of a linear transformation, and also generalized 
eigenvectors. 

Chapter 3 deals with linear systems of differential equations. We start with the 
$n \times n$ system 

(0.0.5)   $\frac{dx}{dt} = Ax, \quad x(0) = x_0 \in \mathbb{C}^n,$

where $A$ is an $n \times n$ matrix, and define the matrix exponential $e^{tA}$, which produces 
the solution 

(0.0.6)   $x(t) = e^{tA} x_0.$


Material from Chapter 2 plays a central role in analyzing this matrix exponential. 
We proceed from (0.0.5) to the inhomogeneous system 

(0.0.7)   $\frac{dx}{dt} = Ax + f(t), \quad x(0) = x_0.$

We also study variable coefficient equations 

(0.0.8)   $\frac{dx}{dt} = A(t)x + f(t).$


In particular, we study power series expansions for the solution, when A(t) and f(t) 
are given by convergent power series. We also consider expansions when (0.0.8) has 
a “regular singular point.” These power series topics are usually introduced in the 
context of a single second order equation, before the study of systems. Indeed, in 
Chapter 1, §1.15 touches on this, and §1.16 goes into some detail in the important 
special case of Bessel’s equation. We have saved the general study for Chapter 
3, both to speed the introduction to systems and because the presentation in the 
system context is both more compact and more general than in the context of a 
single, second order equation. 


Chapter 4 crowns the text, with a study of nonlinear systems of differential 
equations. These can have the form 


(0.0.9)   $\frac{dx}{dt} = F(t,x), \quad x(t_0) = y,$

which resembles (0.0.1), except that now $x(t)$ and $F(t,x)$ take values in $\mathbb{R}^n$. We 
begin with general existence and uniqueness results. For this, we convert (0.0.9) to 


the integral equation 


(0.0.10)   $x(t) = y + \int_{t_0}^t F(s,x(s))\, ds,$


and use the Picard iteration to produce the solution, for $|t - t_0|$ subject to certain 
limitations, as a limit of a certain sequence of approximate solutions. This is 
followed by results on how large the interval of existence can be taken. Next, we 
look into results on the smooth dependence of solutions $x = x(t,y)$ to (0.0.9) on 
the initial data y. An important role is played by the linearization of (0.0.9). 


From here we proceed to some qualitative studies of solutions, particularly in 
the autonomous case, $F(t,x) = F(x)$, in which we interpret $F$ as a vector field, and 
the solution as the flow $\Phi$ generated by this vector field. One useful tool is the 
phase portrait, which depicts the behavior of solution curves (also called orbits) 
for nonlinear n x n systems. From the point of view of visualization, the portraits 
work particularly well for n = 2, and are also quite useful for n = 3. 


We study a variety of problems from mathematical physics, including the plan- 
etary motion problem for two bodies interacting by the gravitational force, whose 
solution by Newton was a seminal inspiration to the field of ODE. We also bring in 
further advances in the study of equations of physics, due to Euler and Lagrange, 
involving the variational method. This theory impacts both physical and geomet- 
rical applications of ODE, the latter including equations for geodesics on surfaces 
in $\mathbb{R}^n$. 

By this point we are looking at nonlinear systems whose solutions are not 
necessarily amenable to formulas. In addition to qualitative studies of the nature 
of these solutions, numerical studies arise as an important tool. This is taken up in 
§4.11. We introduce difference schemes, with emphasis on the Runge-Kutta scheme, 
as a very useful computational tool. 


In Sections 4.13—4.14 we turn to some problems arising in mathematical biology. 
This is followed with some results on systems with chaotic dynamics, which arise in 
dimension $\geq 3$. This chapter closes with a number of appendices, some providing 
useful background in calculus, and others taking up further topics in nonlinear 
systems of ODE. 


We follow this introduction with a record of some standard notation that will 
be in use throughout the text. 


Some basic notation 


$\mathbb{R}$ is the set of real numbers. 

$\mathbb{C}$ is the set of complex numbers. 

$\mathbb{Z}$ is the set of integers. 

$\mathbb{Z}^+$ is the set of integers $\geq 0$. 

$\mathbb{N}$ is the set of integers $\geq 1$ (the "natural numbers"). 

$\mathbb{Q}$ is the set of rational numbers. 

$x \in \mathbb{R}$ means $x$ is an element of $\mathbb{R}$, i.e., $x$ is a real number. 

$(a,b)$ denotes the set of $x \in \mathbb{R}$ such that $a < x < b$. 

$[a,b]$ denotes the set of $x \in \mathbb{R}$ such that $a \leq x \leq b$. 

$\{x \in \mathbb{R} : a \leq x \leq b\}$ denotes the set of $x$ in $\mathbb{R}$ such that $a \leq x \leq b$. 

$[a,b) = \{x \in \mathbb{R} : a \leq x < b\}$ and $(a,b] = \{x \in \mathbb{R} : a < x \leq b\}$. 

$\bar{z} = x - iy$ if $z = x + iy \in \mathbb{C}$, $x,y \in \mathbb{R}$. 

$f : A \to B$ denotes that the function $f$ takes points in the set $A$ to points 
in $B$. One also says $f$ maps $A$ to $B$. 

$x \to x_0$ means the variable $x$ tends to the limit $x_0$. 


Chapter 1 


Single differential equations 


This first chapter is devoted to differential equations for a single unknown function, 
with emphasis on equations of the first and second order, i.e., 


(1.0.1)   $\frac{dx}{dt} = f(t,x),$

and 

(1.0.2)   $\frac{d^2x}{dt^2} = f\Bigl(t, x, \frac{dx}{dt}\Bigr).$


Section 1.1 looks at the simplest case of (1.0.1), namely 


(1.0.3)   $\frac{dx}{dt} = x.$

We construct the solution $x(t)$ to (1.0.3) such that $x(0) = 1$ as a power series, 
defining the exponential function 

(1.0.4)   $x(t) = e^t.$

More generally, $x(t) = e^{ct}$ solves $dx/dt = cx$, with $x(0) = 1$. This holds for all 
real $c$ and also for complex $c$. Taking $c = i$ and investigating basic properties of 
$x(t) = e^{it}$, we establish Euler's formula, 

(1.0.5)   $e^{it} = \cos t + i \sin t,$

which in turn leads to a self-contained exposition of basic results on the trigonometric 
functions. 

Section 1.2 treats first order linear equations, of the form 


(1.0.6)   $\frac{dx}{dt} + a(t)x = b(t), \quad x(t_0) = x_0,$


and produces solutions in terms of the exponential function and integrals. Section 
1.3 considers some nonlinear first order equations, particularly equations for which 
separation of variables allows one to produce a solution, in terms of various integrals. 


We differ from many introductions in not lingering on the topic of first order 
equations. For example, we do not treat exact equations and integrating factors in 
this chapter. We consider it more important to get on to the study of second order 
equations. In any case, exact equations do get their due, in §4.4 of Chapter 4. 


In §1.4 we take up second order differential equations. We concentrate there on 
two special classes, each allowing for a reduction to first order equations. In §1.5 
we consider differential equations arising from some physical problems for motion 
in one space dimension, making use of Newton's law $F = ma$. The equations that 
arise in this context are amenable to methods of §1.4. In §1.5 we restate these 
methods in terms that celebrate the physical quantities of kinetic and potential 
energy, and the conservation of total energy. Section 1.6 deals with the classical 
pendulum, a close relative of motion on a line. In §1.7 we discuss motion in the 
presence of resistance, including the pendulum with resistance. 


Formulas from §1.6 give rise to complicated integrals, and problems of §1.7 have 
additional complications. These complications arise because of nonlinearities in the 
equations. In §1.8 we discuss linearization of these equations. The associated linear 
differential equations are amenable to explicit analysis. 


Sections 1.9-1.15 are devoted to linear second order differential equations, start- 
ing with constant coefficient equations 


(1.0.7)   $\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t),$

first with $f = 0$ in §1.9, then allowing $f$ to be nonzero. In §1.10 we consider certain 
special forms of $f(t)$, including 

(1.0.8)   $e^{at}, \quad \sin \sigma t, \quad \cos \sigma t, \quad t^k,$

treating these cases by the method of undetermined coefficients. We discuss implications 
of results here, when $f(t) = A \sin \sigma t$, for the forced, linearized pendulum, in 
§1.11. Sections 1.12-1.13 treat other physical problems leading to equations of the 
form (1.0.7), namely spring motion problems and models of certain simple electrical 
circuits, called RLC circuits. In §1.14 we bring up another method, variation of 
parameters, which applies to general functions f in (1.0.7). 


Section 1.15 gives some results on variable coefficient second order linear differ- 
ential equations. Tools brought to bear on these equations include power series rep- 
resentations, extending the power series attack used on (1.0.3), and the Wronskian, 
first introduced in the constant-coefficient context in §1.12. In §1.16 we concentrate 
on a particularly important second-order ODE with variable coefficients, Bessel’s 
equation, further pushing power series techniques and the use of the Wronskian. 
In §1.17 we discuss differential equations of order $\geq 3$. In §1.18 we introduce the 
Laplace transform as a tool to treat nonhomogeneous differential equations, such 
as (1.0.7) and higher order variants. Material introduced in §§1.15-1.18 will be 
covered, on a much more general level, in Chapter 3. 


We end this chapter with three appendices. Appendix 1.A explains how Bessel 
functions arise in the search for solutions to some basic partial differential equations. 
Appendix 1.B has some basic material on Euler’s gamma function, of use in §1.16. 
Appendix 1.C establishes that convergent power series can be differentiated term 
by term. We also derive the power series of f(t) = (1—t)~". 




1.1. The exponential and trigonometric functions 


We construct the exponential function to solve the differential equation 
(1.1.1)   $\frac{dx}{dt} = x, \quad x(0) = 1.$

We seek a solution as a power series 

(1.1.2)   $x(t) = \sum_{k=0}^\infty a_k t^k.$


If such a power series converges for t in an interval in R, it can be differentiated 
term-by-term. (See (1.1.45)—(1.1.50) below, and also §1.C, for more on this.) In 
such a case, 


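(1.1.3)   $\frac{dx}{dt} = \sum_{k=1}^\infty k a_k t^{k-1} = \sum_{\ell=0}^\infty (\ell+1) a_{\ell+1} t^\ell,$

so for (1.1.1) to hold we need 

(1.1.4)   $a_0 = 1, \quad a_{k+1} = \frac{a_k}{k+1},$

i.e., $a_k = 1/k!$, where $k! = k(k-1) \cdots 2 \cdot 1$. Thus (1.1.1) is solved by 

(1.1.5)   $x(t) = e^t = \sum_{k=0}^\infty \frac{1}{k!} t^k, \quad t \in \mathbb{R}.$

This defines the exponential function $e^t$. 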
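
As a numerical illustration of the construction above, the recursion (1.1.4) can be used to sum the series (1.1.5) term by term. This is only a sketch: the function name `exp_series` and the cutoff `n_terms` are illustrative choices, not part of the text.

```python
import math


def exp_series(t, n_terms=30):
    """Partial sum of (1.1.5): sum of t^k / k! for k = 0, ..., n_terms - 1.

    Each term is obtained from the previous one via a_{k+1} = a_k / (k + 1),
    as in (1.1.4), so no factorial is computed explicitly.
    """
    total = 0.0
    term = 1.0  # a_0 * t^0
    for k in range(n_terms):
        total += term
        term *= t / (k + 1)  # a_{k+1} t^{k+1} = (a_k t^k) * t / (k + 1)
    return total


if __name__ == "__main__":
    # Compare the partial sums with the library exponential.
    for t in (-2.0, 0.0, 1.0, 3.0):
        print(t, exp_series(t), math.exp(t))
```

For moderate |t| a few dozen terms already agree with `math.exp` to roughly machine precision; for large |t| more terms would be needed.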


More generally, we can define 
(1.1.6)   $e^z = \sum_{k=0}^\infty \frac{1}{k!} z^k, \quad z \in \mathbb{C}.$

The issue of convergence for complex power series is essentially the same as for real 
power series. Given $z = x + iy$, $x,y \in \mathbb{R}$, we have $|z| = \sqrt{x^2 + y^2}$. If also $w \in \mathbb{C}$, 
then $|z + w| \leq |z| + |w|$ and $|zw| = |z| \cdot |w|$. Hence 

$\Bigl| \sum_{k=m}^{m+n} \frac{1}{k!} z^k \Bigr| \leq \sum_{k=m}^{m+n} \frac{1}{k!} |z|^k.$

The ratio test then shows that the series (1.1.6) is absolutely convergent for all 
$z \in \mathbb{C}$, and uniformly convergent for $|z| \leq R$, for each $R < \infty$. Note that 


(1.1.7)   $e^{at} = \sum_{k=0}^\infty \frac{a^k}{k!} t^k$

solves 

(1.1.8)   $\frac{d}{dt} e^{at} = a e^{at},$

and this works for each $a \in \mathbb{C}$. 




We claim that $e^{at}$ is the only solution to 

(1.1.9)   $\frac{dy}{dt} = ay, \quad y(0) = 1.$

To see this, compute the derivative of $e^{-at}y(t)$: 

(1.1.10)   $\frac{d}{dt}\bigl(e^{-at}y(t)\bigr) = -ae^{-at}y(t) + e^{-at}ay(t) = 0,$

where we use the product rule, (1.1.8) (with $a$ replaced by $-a$) and (1.1.9). Thus 
$e^{-at}y(t)$ is independent of $t$. Evaluating at $t = 0$ gives 

(1.1.11)   $e^{-at}y(t) = 1, \quad \forall\, t \in \mathbb{R},$

whenever $y(t)$ solves (1.1.9). Since $e^{at}$ solves (1.1.9), we have $e^{-at}e^{at} = 1$, hence 

(1.1.12)   $e^{-at} = \frac{1}{e^{at}}, \quad \forall\, t \in \mathbb{R},\ a \in \mathbb{C}.$

Thus multiplying both sides of (1.1.11) by $e^{at}$ gives the asserted uniqueness: 

(1.1.13)   $y(t) = e^{at}, \quad \forall\, t \in \mathbb{R}.$


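We can draw further useful conclusions from applying $d/dt$ to products of exponential 
functions. In fact, let $a,b \in \mathbb{C}$. Then 

(1.1.14)   $\frac{d}{dt}\bigl(e^{-at}e^{-bt}e^{(a+b)t}\bigr) = -ae^{-at}e^{-bt}e^{(a+b)t} - be^{-at}e^{-bt}e^{(a+b)t} + (a+b)e^{-at}e^{-bt}e^{(a+b)t} = 0,$

so again we are differentiating a function that is independent of $t$. Evaluation at 
$t = 0$ gives 

(1.1.15)   $e^{-at}e^{-bt}e^{(a+b)t} = 1, \quad \forall\, t \in \mathbb{R}.$

Using (1.1.12), we get 

(1.1.16)   $e^{(a+b)t} = e^{at}e^{bt}, \quad \forall\, t \in \mathbb{R},\ a,b \in \mathbb{C},$

or, setting $t = 1$, 

(1.1.17)   $e^{a+b} = e^a e^b, \quad \forall\, a,b \in \mathbb{C}.$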


We next record some properties of $\exp(t) = e^t$ for real $t$. The power series 
(1.1.5) clearly gives $e^t > 0$ for $t \geq 0$. Since $e^{-t} = 1/e^t$, we see that $e^t > 0$ for all 
$t \in \mathbb{R}$. Since $de^t/dt = e^t > 0$, the function is monotone increasing in $t$, and since 
$d^2e^t/dt^2 = e^t > 0$, this function is convex. Note that 

(1.1.18)   $e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots > 2,$

so $e^k > 2^k \nearrow +\infty$ as $k \to +\infty$. Hence 

(1.1.19)   $\lim_{t \to +\infty} e^t = +\infty.$

Since $e^{-t} = 1/e^t$, 

(1.1.20)   $\lim_{t \to -\infty} e^t = 0.$

Figure 1.1.1. Exponential function 


As a consequence, 
(1.1.21)   $\exp : \mathbb{R} \longrightarrow (0,\infty)$

is smooth and one-to-one and onto, with positive derivative, so the inverse function 
theorem of one-variable calculus applies. There is a smooth inverse 

(1.1.22)   $L : (0,\infty) \longrightarrow \mathbb{R}.$

We call this inverse the natural logarithm: 

(1.1.23)   $\log x = L(x).$

See Figures 1.1.1 and 1.1.2 for graphs of $x = e^t$ and $t = \log x$. 
Applying $d/dt$ to 

(1.1.24)   $L(e^t) = t$

gives 

(1.1.25)   $L'(e^t)e^t = 1, \quad \text{hence} \quad L'(e^t) = \frac{1}{e^t},$

i.e., 

(1.1.26)   $\frac{d}{dx} \log x = \frac{1}{x}.$

Since $\log 1 = 0$, we get 

(1.1.27)   $\log x = \int_1^x \frac{dy}{y}.$

An immediate consequence of (1.1.17) (for $a,b \in \mathbb{R}$) is the identity 

(1.1.28)   $\log xy = \log x + \log y, \quad x,y \in (0,\infty).$

We move on to a study of $e^z$ for purely imaginary $z$, i.e., of 

(1.1.29)   $\gamma(t) = e^{it}, \quad t \in \mathbb{R}.$


Figure 1.1.2. The logarithm 


This traces out a curve in the complex plane, and we want to understand which 
curve it is. Let us set 


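(1.1.30)   $e^{it} = c(t) + is(t),$

with $c(t)$ and $s(t)$ real valued. First we calculate $|e^{it}|^2 = c(t)^2 + s(t)^2$. For $x,y \in \mathbb{R}$, 

(1.1.31)   $z = x + iy \Longrightarrow \bar{z} = x - iy \Longrightarrow z\bar{z} = x^2 + y^2 = |z|^2.$

It is elementary that 

(1.1.32)   $z,w \in \mathbb{C} \Longrightarrow \overline{zw} = \bar{z}\,\bar{w} \Longrightarrow \overline{z^n} = \bar{z}^n,$ and $\overline{z+w} = \bar{z} + \bar{w}.$

Hence 

(1.1.33)   $\overline{e^z} = \sum_{k=0}^\infty \frac{\bar{z}^k}{k!} = e^{\bar{z}}.$

In particular, 

(1.1.34)   $t \in \mathbb{R} \Longrightarrow |e^{it}|^2 = e^{it}e^{-it} = 1.$

Hence $t \mapsto \gamma(t) = e^{it}$ has image in the unit circle centered at the origin in $\mathbb{C}$. Also 

(1.1.35)   $\gamma'(t) = ie^{it} \Longrightarrow |\gamma'(t)| = 1,$

so $\gamma(t)$ moves at unit speed on the unit circle. We have 

(1.1.36)   $\gamma(0) = 1, \quad \gamma'(0) = i.$

Thus, for $t$ between 0 and the circumference of the unit circle, the arc from $\gamma(0)$ to 
$\gamma(t)$ is an arc on the unit circle, pictured in Figure 1.1.3, of length 

(1.1.37)   $\ell(t) = \int_0^t |\gamma'(s)|\, ds = t.$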


Figure 1.1.3. Behind Euler's formula 


Standard definitions from trigonometry say that the line segments from 0 to 
1 and from 0 to $\gamma(t)$ meet at an angle whose measurement in radians is equal to the 
length of the arc of the unit circle from 1 to $\gamma(t)$, i.e., to $\ell(t)$. The cosine of this 
angle is defined to be the $x$-coordinate of $\gamma(t)$ and the sine of the angle is defined 
to be the $y$-coordinate of $\gamma(t)$. Hence the computation (1.1.37) gives 

(1.1.38)   $c(t) = \cos t, \quad s(t) = \sin t.$

Thus (1.1.30) becomes 

(1.1.39)   $e^{it} = \cos t + i \sin t,$

which is Euler's formula. The identity 

(1.1.40)   $\frac{d}{dt} e^{it} = ie^{it},$

applied to (1.1.39), yields 

(1.1.41)   $\frac{d}{dt} \cos t = -\sin t, \quad \frac{d}{dt} \sin t = \cos t.$

We can use (1.1.17) to derive formulas for sin and cos of the sum of two angles. 
Indeed, comparing 

(1.1.42)   $e^{i(s+t)} = \cos(s+t) + i \sin(s+t)$

with 

(1.1.43)   $e^{is}e^{it} = (\cos s + i \sin s)(\cos t + i \sin t)$

gives 

(1.1.44)   $\cos(s+t) = (\cos s)(\cos t) - (\sin s)(\sin t), \quad \sin(s+t) = (\sin s)(\cos t) + (\cos s)(\sin t).$
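
Euler's formula (1.1.39) and the product identity behind the addition formulas (1.1.44) can be checked numerically by summing the complex power series (1.1.6) directly. The sketch below is an illustration only; the helper names `exp_series`, `euler_gap`, and `addition_gap` are made up for this example.

```python
import math


def exp_series(z, n_terms=40):
    """Partial sum of the complex series (1.1.6), sum of z^k / k!."""
    total = 0j
    term = 1 + 0j  # the k = 0 term
    for k in range(n_terms):
        total += term
        term *= z / (k + 1)
    return total


def euler_gap(t):
    """|e^{it} - (cos t + i sin t)|, with e^{it} summed from the series."""
    return abs(exp_series(1j * t) - complex(math.cos(t), math.sin(t)))


def addition_gap(s, t):
    """|e^{i(s+t)} - e^{is} e^{it}|, the identity (1.1.17) with a = is, b = it."""
    return abs(exp_series(1j * (s + t)) - exp_series(1j * s) * exp_series(1j * t))


if __name__ == "__main__":
    print(euler_gap(0.7), addition_gap(0.4, 1.1), abs(exp_series(2.0j)))
```

Both gaps come out at the level of floating point rounding, and the modulus of the series value at a purely imaginary argument stays at 1, in line with (1.1.34).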




Returning to basics, we recall that the calculations done so far in this section 
were all predicated on the fact that the power series (1.1.7) can be differentiated 
term by term. This is a special case of a general result about convergent power 
series, established in §1.C. However, making use of the special structure of (1.1.7), 
we include a direct demonstration here. To begin, look at 


(1.1.45)   $E_n^a(t) = \sum_{k=0}^n \frac{a^k}{k!} t^k,$

which satisfies 

(1.1.46)   $\frac{d}{dt} E_n^a(t) = \sum_{k=1}^n \frac{a^k}{(k-1)!} t^{k-1} = a \sum_{\ell=0}^{n-1} \frac{a^\ell}{\ell!} t^\ell = aE_{n-1}^a(t).$

Integration gives 

(1.1.47)   $a \int_0^t E_{n-1}^a(s)\, ds = E_n^a(t) - 1.$

Now we have 

(1.1.48)   $E_{n-1}^a(s) \to e^{as}, \quad E_n^a(t) \to e^{at},$

uniformly on finite intervals, as $n \to \infty$, and then the integral estimate 

$\Bigl| \int_0^t (E(s) - F(s))\, ds \Bigr| \leq |t| \max_{|s| \leq |t|} |E(s) - F(s)|$

implies 

(1.1.49)   $\int_0^t E_{n-1}^a(s)\, ds \longrightarrow \int_0^t e^{as}\, ds,$

as $n \to \infty$. Consequently, we can pass to the limit $n \to \infty$ in (1.1.47) and get 

(1.1.50)   $a \int_0^t e^{as}\, ds = e^{at} - 1.$

Applying $d/dt$ to the left side of (1.1.50) gives $ae^{at}$, by the fundamental theorem 
of calculus. Hence this must be the derivative of the right side of (1.1.50), and this 
gives (1.1.8). 


Having the integral formula (1.1.50), we proceed to obtain formulas for $\int t^n e^{at}\, dt$. 
In fact, from (1.1.46), (1.1.8), and the product rule, we obtain 

(1.1.51)   $\frac{d}{dt}\bigl(e^{-at} E_n^a(t)\bigr) = -ae^{-at} E_n^a(t) + ae^{-at} E_{n-1}^a(t) = -\frac{a^{n+1}}{n!} t^n e^{-at}.$

Then the fundamental theorem of calculus gives 

(1.1.52)   $\int t^n e^{-at}\, dt = -\frac{n!}{a^{n+1}} E_n^a(t) e^{-at} + C = -\frac{n!}{a^{n+1}} \Bigl(1 + at + \frac{a^2t^2}{2!} + \cdots + \frac{a^nt^n}{n!}\Bigr) e^{-at} + C.$

We have an analogous formula for $\int t^n e^{at}\, dt$, by replacing $a$ by $-a$. 


Exercises 


1. As noted, if $z = x + iy$, $x,y \in \mathbb{R}$, then $|z| = \sqrt{x^2 + y^2}$ is equivalent to $|z|^2 = z\bar{z}$. 
Use this to show that if also $w \in \mathbb{C}$, 

$|zw| = |z| \cdot |w|.$

Note that 

$|z + w|^2 = (z + w)(\bar{z} + \bar{w}) = |z|^2 + |w|^2 + w\bar{z} + z\bar{w} = |z|^2 + |w|^2 + 2 \operatorname{Re} z\bar{w}.$

Show that $\operatorname{Re}(z\bar{w}) \leq |zw|$, and use this in concert with an expansion of $(|z| + |w|)^2$ 
and the first identity above to deduce that 

$|z + w| \leq |z| + |w|.$


2. Define $\pi$ to be the smallest positive number such that $e^{\pi i} = -1$. Show that 

$e^{\pi i/2} = i, \quad e^{\pi i/3} = \frac{1}{2} + \frac{\sqrt{3}}{2} i.$

Hint. See Figure 1.1.4, showing $a = e^{\pi i/3}$. 


3. Show that 
$\cos^2 t + \sin^2 t = 1,$
and 
$1 + \tan^2 t = \sec^2 t,$
where 
$\tan t = \frac{\sin t}{\cos t}, \quad \sec t = \frac{1}{\cos t}.$
4. Show that 

$\frac{d}{dt} \tan t = \sec^2 t = 1 + \tan^2 t,$

$\frac{d}{dt} \sec t = \sec t \tan t.$

Figure 1.1.4. Hexagon 


5. Evaluate 


Hint. Set x = tant. 


6. Evaluate 


Hint. Set x = sint. 


7. Show that 

$\frac{\pi}{6} = \int_0^{1/2} \frac{dx}{\sqrt{1 - x^2}}.$

Hint. Show that $\sin \pi/6 = 1/2$. Use Exercise 2 and the identity $e^{\pi i/6} = e^{\pi i/2} e^{-\pi i/3}$. 


8. Set 
$\cosh t = \frac{1}{2}(e^t + e^{-t}), \quad \sinh t = \frac{1}{2}(e^t - e^{-t}).$
Show that 
$\frac{d}{dt} \cosh t = \sinh t, \quad \frac{d}{dt} \sinh t = \cosh t,$
and 
$\cosh^2 t - \sinh^2 t = 1.$
9. Evaluate 

$\int_0^y \frac{dx}{\sqrt{1 + x^2}}.$

Hint. Set $x = \sinh t$. 


10. Evaluate 
$\int_0^y \sqrt{1 + x^2}\, dx.$
11. Using Exercise 4, verify that 

$\frac{d}{dt}(\sec t + \tan t) = \sec t(\sec t + \tan t),$

$\frac{d}{dt}(\sec t \tan t) = \sec^3 t + \sec t \tan^2 t = 2\sec^3 t - \sec t.$

12. Next verify that 

$\frac{d}{dt} \log|\sec t| = \tan t,$

$\frac{d}{dt} \log|\sec t + \tan t| = \sec t.$

13. Now verify that 

$\int \tan t\, dt = \log|\sec t|,$

$\int \sec t\, dt = \log|\sec t + \tan t|,$

$2 \int \sec^3 t\, dt = \sec t \tan t + \int \sec t\, dt.$

(Here, we omit the arbitrary additive constants.) 


14. Here is another approach to the evaluation of $\int \sec t\, dt$. We evaluate 

$I(u) = \int_0^u \frac{dv}{\sqrt{1 + v^2}}$

in two ways. 

(a) Using $v = \sinh y$, show that 

$I(u) = \sinh^{-1} u.$

(b) Using $v = \tan t$, show that 

$I(u) = \int_0^{\tan^{-1} u} \sec t\, dt.$

Deduce that 

$\int_0^x \sec t\, dt = \sinh^{-1}(\tan x), \quad \text{for } |x| < \frac{\pi}{2}.$

Deduce from the formula above that also 

$\cosh\Bigl(\int_0^x \sec t\, dt\Bigr) = \sec x,$

and hence that 

$\exp\Bigl(\int_0^x \sec t\, dt\Bigr) = \sec x + \tan x.$

Compare these formulas with the analogue in Exercise 13. 


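15. For $E_n^a(t)$ as in (1.1.45), $k \geq 1$, $0 < T < \infty$, show that 

(1.1.53)   $\max_{|t| \leq T} |E_{n+k}^a(t) - E_n^a(t)| \leq \frac{|aT|^{n+1}}{(n+1)!} \Bigl(1 + \frac{|aT|}{n+2} + \cdots + \frac{|aT|^{k-1}}{(n+2)^{k-1}}\Bigr),$

and that this is 

(1.1.54)   $\leq 2 \frac{|aT|^{n+1}}{(n+1)!}, \quad \text{for } n + 2 \geq 2|aT|.$

Deduce that 

(1.1.55)   $\max_{|t| \leq T} |e^{at} - E_n^a(t)|$

satisfies (1.1.54). Show that, for each $a$, $T$, (1.1.54) tends to 0 as $n \to \infty$, yielding 
the assertion made about convergence in (1.1.48). 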


16. Show that 

$\Bigl| \int_0^t e^{as}\, ds - \int_0^t E_n^a(s)\, ds \Bigr| \leq |t| \max_{|s| \leq |t|} |e^{as} - E_n^a(s)|,$

and observe how this, together with Exercise 15, yields (1.1.49). 


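17. Show that 

(1.1.56)   $|t| < 1 \Longrightarrow \log(1 + t) = \sum_{k=1}^\infty \frac{(-1)^{k-1}}{k} t^k = t - \frac{t^2}{2} + \frac{t^3}{3} - \cdots.$

Hint. Rewrite (1.1.27) as 

$\log(1 + t) = \int_0^t \frac{ds}{1 + s},$

expand 

$\frac{1}{1 + s} = 1 - s + s^2 - s^3 + \cdots, \quad |s| < 1,$

and integrate term by term. 


18. Use (1.1.52) with $a = -i$ to produce formulas for 

$\int t^n \cos t\, dt$ and $\int t^n \sin t\, dt.$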


19. Figure 1.1.5 (a)—(b) shows graphs of the image of 


y(t) =e, O<t<6n, 
Figure 1.1.5. Spirals 


for 


Match each value of a to (a) or (b). 


20. Given $t > 0$ and $a \in \mathbb{C}$, we define $t^a$ by 

$t^a = e^{a \log t}.$

Show that, for $t > 0$, 

$\frac{d}{dt} t^a = at^{a-1}.$


1.2. First order linear equations 
Here we tackle first order linear equations. These are equations of the form 

(1.2.1)   $\frac{dx}{dt} + a(t)x = b(t), \quad x(t_0) = x_0,$

given functions $a(t)$ and $b(t)$, continuous on some interval containing $t_0$. As a 
warm-up, we first treat 

(1.2.2)   $\frac{dx}{dt} + ax = b, \quad x(0) = x_0,$

with $a$ and $b$ constants. One key to solving (1.2.2) is the identity 

(1.2.3)   $\frac{d}{dt}(e^{at}x) = e^{at}\Bigl(\frac{dx}{dt} + ax\Bigr),$

which follows by applying the product formula and (1.1.8). Thus, multiplying both 
sides of (1.2.2) by $e^{at}$ gives 

(1.2.4)   $\frac{d}{dt}(e^{at}x) = e^{at}b,$

and then integrating both sides from 0 to $t$ gives 

(1.2.5)   $e^{at}x(t) = x_0 + \int_0^t e^{as} b\, ds.$

We can carry out the integral, using (1.1.50), and get 

(1.2.6)   $e^{at}x(t) = x_0 + \frac{e^{at} - 1}{a}\, b,$

and finally division by $e^{at}$ yields 

(1.2.7)   $x(t) = e^{-at}x_0 + \frac{b}{a}(1 - e^{-at}) = \frac{b}{a} + e^{-at}\Bigl(x_0 - \frac{b}{a}\Bigr).$
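
A formula such as (1.2.7) is easy to sanity-check numerically: evaluate it, approximate its derivative by a central difference, and see whether the equation (1.2.2) is satisfied. The sketch below does this; the names `x_exact` and `residual`, and the sample values of $a$, $b$, $x_0$, are illustrative assumptions.

```python
import math


def x_exact(t, a, b, x0):
    """Closed form (1.2.7): x(t) = b/a + e^{-at} (x0 - b/a), for a != 0."""
    return b / a + math.exp(-a * t) * (x0 - b / a)


def residual(t, a, b, x0, h=1e-5):
    """Approximate dx/dt + a x - b at time t, using a central difference."""
    dxdt = (x_exact(t + h, a, b, x0) - x_exact(t - h, a, b, x0)) / (2 * h)
    return dxdt + a * x_exact(t, a, b, x0) - b


if __name__ == "__main__":
    a, b, x0 = 2.0, 3.0, 5.0
    print(x_exact(0.0, a, b, x0))  # should reproduce the initial value x0
    print(residual(0.7, a, b, x0))  # should be close to zero
```

The residual is not exactly zero because of the finite difference step `h`, but it should sit far below the size of the individual terms.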


In order to tackle (1.2.1), we need a replacement for (1.2.3). To get it, note 
that if $A(t)$ is differentiable, the chain rule plus (1.1.8) gives 

(1.2.8)   $\frac{d}{dt} e^{A(t)} = e^{A(t)} A'(t).$

Hence 

(1.2.9)   $\frac{d}{dt}\bigl(e^{A(t)}x\bigr) = e^{A(t)}\Bigl(\frac{dx}{dt} + A'(t)x\Bigr).$

Thus we can multiply (1.2.1) by $e^{A(t)}$ and get 

(1.2.10)   $\frac{d}{dt}\bigl(e^{A(t)}x\bigr) = e^{A(t)} b(t),$

provided 

(1.2.11)   $A'(t) = a(t).$

To arrange this, we can set 

(1.2.12)   $A(t) = \int_{t_0}^t a(s)\, ds.$

Then we can integrate (1.2.10) from $t_0$ to $t$, to get 

(1.2.13)   $e^{A(t)}x(t) = x_0 + \int_{t_0}^t e^{A(s)} b(s)\, ds,$

and hence 

(1.2.14)   $x(t) = e^{-A(t)}x_0 + e^{-A(t)} \int_{t_0}^t e^{A(s)} b(s)\, ds.$
to 
For example, consider 

(1.2.15)   $\frac{dx}{dt} - tx = b(t), \quad x(0) = x_0.$

From (1.2.12) we get 

(1.2.16)   $A(t) = -\frac{t^2}{2},$

and (1.2.10) becomes 

(1.2.17)   $\frac{d}{dt}\bigl(e^{-t^2/2}x\bigr) = e^{-t^2/2} b(t),$

hence 

(1.2.18)   $e^{-t^2/2}x(t) = x_0 + \int_0^t e^{-s^2/2} b(s)\, ds.$

Let us look at two special cases. First, 

(1.2.19)   $b(t) = t.$

Then the integral in (1.2.18) is 

(1.2.20)   $\int_0^t e^{-s^2/2} s\, ds = \int_0^{t^2/2} e^{-\sigma}\, d\sigma = 1 - e^{-t^2/2}.$

The second case is 

(1.2.21)   $b(t) = 1.$

Then the integral in (1.2.18) is 

(1.2.22)   $\int_0^t e^{-s^2/2}\, ds.$

This is not an elementary function, but it can be related to the special function 

(1.2.23)   $\operatorname{Erf}(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^t e^{-s^2/2}\, ds.$

Namely, 

(1.2.24)   $\frac{1}{\sqrt{2\pi}} \int_0^t e^{-s^2/2}\, ds = \operatorname{Erf}(t) - \operatorname{Erf}(0).$

Note that 

(1.2.25)   $\operatorname{Erf}(0) = \frac{1}{2} \operatorname{Erf}(\infty) = \frac{1}{2\sqrt{2\pi}}\, I, \quad I = \int_{-\infty}^\infty e^{-s^2/2}\, ds,$

where 

(1.2.26)   $I^2 = \iint_{\mathbb{R}^2} e^{-(x^2+y^2)/2}\, dx\, dy = \int_0^{2\pi} \int_0^\infty e^{-r^2/2} r\, dr\, d\theta = 2\pi \int_0^\infty e^{-s}\, ds = 2\pi.$

Hence we have 

(1.2.27)   $\operatorname{Erf}(\infty) = 1, \quad \operatorname{Erf}(0) = \frac{1}{2}.$
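
The integrating factor formula (1.2.14) can also be evaluated numerically when the integrals are not available in closed form. The following sketch uses the trapezoid rule for both $A(t)$ and the integral in (1.2.14), and compares against the two special cases of (1.2.15) treated above. The function names `solve_linear` and `book_erf` are illustrative. One assumption to flag: Python's `math.erf` uses a different normalization from (1.2.23), and the conversion used here is $\operatorname{Erf}(t) = \frac{1}{2}(1 + \texttt{math.erf}(t/\sqrt{2}))$.

```python
import math


def book_erf(t):
    """Erf normalized as in (1.2.23), expressed via the standard error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def solve_linear(a, b, x0, t0, t, n=2000):
    """Evaluate (1.2.14) for dx/dt + a(t) x = b(t), x(t0) = x0.

    A(s) and the integral of e^{A(s)} b(s) are both approximated
    with the trapezoid rule on n equal subintervals of [t0, t].
    """
    h = (t - t0) / n
    s = [t0 + i * h for i in range(n + 1)]
    A = [0.0]  # A(t0) = 0, then cumulative trapezoid sums
    for i in range(n):
        A.append(A[-1] + 0.5 * h * (a(s[i]) + a(s[i + 1])))
    g = [math.exp(A[i]) * b(s[i]) for i in range(n + 1)]
    integral = sum(0.5 * h * (g[i] + g[i + 1]) for i in range(n))
    return math.exp(-A[-1]) * (x0 + integral)


if __name__ == "__main__":
    # Case b(t) = t: (1.2.18) and (1.2.20) give x(t) = e^{t^2/2} (x0 + 1 - e^{-t^2/2}).
    x0, t = 2.0, 1.0
    exact_1 = math.exp(t * t / 2) * (x0 + 1.0 - math.exp(-t * t / 2))
    print(solve_linear(lambda u: -u, lambda u: u, x0, 0.0, t), exact_1)

    # Case b(t) = 1, x0 = 0: (1.2.18) and (1.2.24) give
    # x(t) = e^{t^2/2} sqrt(2 pi) (Erf(t) - Erf(0)).
    exact_2 = math.exp(t * t / 2) * math.sqrt(2 * math.pi) * (book_erf(t) - book_erf(0.0))
    print(solve_linear(lambda u: -u, lambda u: 1.0, 0.0, 0.0, t), exact_2)
```

The trapezoid rule is chosen only for brevity; any quadrature rule would serve the same purpose.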
Bernoulli equations 


Equations of the form 

(1.2.28)   $\frac{dx}{dt} + a(t)x = b(t)x^n$

are called Bernoulli equations. Such an equation is not linear if $n \neq 1$ or 0, but in 
these cases one gets a linear equation by the substitution 

(1.2.29)   $y = x^{1-n}.$

In fact, (1.2.29) gives $y' = (1-n)x^{-n}x'$, and plugging in (1.2.28) gives 

(1.2.30)   $\frac{dy}{dt} = (1-n)[b(t) - a(t)y],$

which is linear. 


Exercises 


Solve the following initial value problems. Do the integrals if you can. 


1. 
“ ee ee he 
2. 
o Pr=t?, 2«(0)=1 
3. 
of +2 = cost, x(0) =0 
4. 
o tie=t, x(0) =1 
5. 
ot ieae', a(0)=1 
6. 
$\frac{dx}{dt} + (\tan t)x = \cos t, \quad x(0) = 1.$
7. 
$\frac{dx}{dt} + (\sec t)x = \cos t, \quad x(0) = 1.$


1.3. Separable equations 


A separable differential equation is one for which the method of separation of vari- 
ables, which we introduce in this section, is applicable. We illustrate this with 
another approach to the equation (1.2.2), which we rewrite as 


(1.3.1)   $\frac{dx}{dt} = b - ax, \quad x(0) = x_0.$

Separating variables involves moving the $x$-dependent objects to the left and the 
$t$-dependent objects to the right, when possible. In case (1.3.1), this is possible; we 
have 

(1.3.2)   $\frac{dx}{b - ax} = dt.$

We next integrate both sides. A change of variable allows us to use (1.1.27), to 
obtain 

(1.3.3)   $\int \frac{dx}{b - ax} = -\frac{1}{a} \int \frac{dx}{x - b/a} = -\frac{1}{a} \log\Bigl|x - \frac{b}{a}\Bigr| + C.$

Hence (1.3.2) yields 

(1.3.4)   $-\frac{1}{a} \log\Bigl|x - \frac{b}{a}\Bigr| = t + C,$

hence 

(1.3.5)   $x(t) - \frac{b}{a} = \pm e^{-a(t+C)} = Ke^{-at}.$

Here $K$ is a constant, which can be found by using the initial condition $x(0) = x_0$. 
We get $x_0 - b/a = K$, so (1.3.5) yields 

(1.3.6)   $x(t) = \frac{b}{a} + e^{-at}\Bigl(x_0 - \frac{b}{a}\Bigr),$

consistent with (1.2.7). 


Generally, a separable differential equation is one that can be put in the form 

(1.3.7)   $\frac{dx}{dt} = f(x)g(t),$

and then separation of variables gives 

(1.3.8)   $\frac{dx}{f(x)} = g(t)\, dt,$

integrating to 

(1.3.9)   $\int \frac{dx}{f(x)} = \int g(t)\, dt.$

Here is another basic example: 

(1.3.10)   $\frac{dx}{dt} = x^2, \quad x(0) = 1.$

We get 

(1.3.11)   $\frac{dx}{x^2} = dt,$

which integrates to 

(1.3.12)   $-\frac{1}{x} = t + C,$

hence $x = -1/(t + C)$. The initial condition in (1.3.10) gives $C = -1$, so the 
solution to (1.3.10) is 

(1.3.13)   $x(t) = \frac{1}{1 - t}.$

Note that this solution blows up as $t \nearrow 1$. 
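
The blow-up of (1.3.13) can be watched numerically. The sketch below integrates (1.3.10) with a classical fourth order Runge-Kutta step (a standard scheme, used here only as a convenient integrator) and compares with $1/(1-t)$ as $t$ approaches 1. The function name `rk4` and the step counts are illustrative choices.

```python
def rk4(f, t0, x0, t1, n_steps):
    """Integrate dx/dt = f(t, x) from t0 to t1 with classical RK4; return x(t1)."""
    h = (t1 - t0) / n_steps
    t, x = t0, x0
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


if __name__ == "__main__":
    # dx/dt = x^2, x(0) = 1, whose exact solution is 1 / (1 - t).
    for t1 in (0.5, 0.9, 0.99):
        approx = rk4(lambda t, x: x * x, 0.0, 1.0, t1, 10000)
        print(t1, approx, 1.0 / (1.0 - t1))
```

No fixed-step scheme can be pushed through $t = 1$ here; the numerical values simply grow without bound as the endpoint approaches the blow-up time.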


The hanging cable 


Suppose a length of cable, lying in the $(x,y)$-plane, is fastened at $(-a,0)$ and at 
$(a,0)$, and hangs down freely, in equilibrium, as pictured in Figure 1.3.1. The force 
of gravity acts in the direction of the negative $y$-axis. We want the equation of the 
curve traced out by the cable, which we assume to have length $2L$ (not stretchable) 
and uniform mass density. 


To tackle this problem, we introduce $\theta(x)$, the angle the tangent to the curve 
at $(x, y(x))$ makes with the $x$-axis, which is given by 

(1.3.14)   $\tan \theta(x) = y'(x).$

We will derive a differential equation for $\theta(x)$, as follows. 

At each point $(x, y(x))$, there is a tension on the cable, of magnitude $T(x)$, and 
the physical laws governing the behavior of the cable are the following. First, the 
horizontal component of the tension, given by $T(x) \cos \theta(x)$, is constant. Second, 
the vertical component of the tension, given by $T(x) \sin \theta(x)$, is proportional to the 
weight of the cable lying below $y = y(x)$, hence to the length $L(x)$ of the cable, 
from $(0, y(0))$ to $(x, y(x))$. In other words, we have 

(1.3.15)   $T(x) \cos \theta(x) = T_0, \quad T(x) \sin \theta(x) = \kappa L(x),$

where $T_0$ and $\kappa$ are certain constants (whose quotient will be specified below). As 
for $L(x)$, we have 

(1.3.16)   $L(x) = \int_0^x \sqrt{1 + y'(t)^2}\, dt = \int_0^x \sec \theta(t)\, dt,$

by (1.3.14) and Exercise 3 of §1.1. 
Taking the quotient of the two identities in (1.3.15) yields 

(1.3.17)   $\tan \theta(x) = \beta \int_0^x \sec \theta(t)\, dt, \quad \beta = \frac{\kappa}{T_0}.$

Differentiating (1.3.17) with respect to $x$ and using Exercise 4 of §1.1, we get 

(1.3.18)   $\sec^2 \theta(x)\, \frac{d\theta}{dx} = \beta \sec \theta(x),$


Figure 1.3.1. Catenary 


i.e., 

(1.3.19)   $\frac{d\theta}{dx} = \beta \cos \theta.$


We can separate variables here, to obtain 

(1.3.20)   $\int \sec \theta\, d\theta = \beta \int dx.$

Exercise 14 of §1.1 applies to the integral on the left, and we get 

(1.3.21)   $\sec \theta(x) = \cosh(\beta x + \alpha).$

To yield the expected result $\theta(0) = 0$ (see Figure 1.3.1 again), we set $\alpha = 0$. 
To get a formula for $y(x)$, use (1.3.14) to write 

(1.3.22)   $y(x) = y_0 + \int_0^x \tan \theta(t)\, dt, \quad y_0 = y(0).$

Now, by Exercises 3 and 8 of §1.1, together with (1.3.21), we have 

(1.3.23)   $\tan^2 \theta(x) = \sec^2 \theta(x) - 1 = \cosh^2 \beta x - 1 = \sinh^2 \beta x,$

so (1.3.22) gives 

(1.3.24)   $y(x) = y_0 + \int_0^x \sinh \beta t\, dt = y_0 - \frac{1}{\beta} + \frac{1}{\beta} \cosh \beta x.$

The graph of such a curve is called a catenary. 


If we are given that the endpoints of the cable are at $(\pm a, 0)$ and that the total 
length is $2L$ (necessarily $L > a$), we can recover $\beta$ and $y_0$ in (1.3.24), as follows. 
From (1.3.16) and (1.3.21), 

(1.3.25)   $L = \int_0^a \cosh \beta t\, dt = \frac{1}{\beta} \sinh \beta a,$

so $\beta$ is uniquely determined by the property that 

(1.3.26)   $\frac{\sinh a\beta}{a\beta} = \frac{L}{a}.$

Note that $h(\tau) = (\sinh \tau)/\tau$ is smooth, $h(0) = 1$, $h'(\tau) > 0$ for $\tau > 0$, and $h(\tau) \nearrow 
+\infty$ as $\tau \nearrow +\infty$. Once one has $\beta$, then the identity $y(a) = 0$ gives 

(1.3.27)   $y_0 = \frac{1}{\beta} - \frac{1}{\beta} \cosh \beta a.$
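
Since $h(\tau) = (\sinh \tau)/\tau$ is increasing for $\tau > 0$, the condition determining $\beta$ from $a$ and $L$ can be solved by bisection. The sketch below does so and then evaluates $y_0$ from (1.3.27). It assumes the determining relation is $h(a\beta) = L/a$, as follows from (1.3.25); the function names `catenary_params` and `catenary_y` are illustrative.

```python
import math


def catenary_params(a, L, tol=1e-12):
    """Return (beta, y0) for a cable of length 2L fastened at (-a, 0) and (a, 0).

    Solves sinh(tau) / tau = L / a for tau = a * beta by bisection,
    using that the left side increases from 1 to infinity on (0, infinity).
    """
    if L <= a:
        raise ValueError("need L > a")
    target = L / a

    def h(tau):
        return math.sinh(tau) / tau

    lo, hi = 1e-9, 1.0
    while h(hi) < target:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi) / a
    y0 = (1.0 - math.cosh(beta * a)) / beta  # (1.3.27)
    return beta, y0


def catenary_y(x, beta, y0):
    """Height of the cable at x, from (1.3.24)."""
    return y0 - 1.0 / beta + math.cosh(beta * x) / beta


if __name__ == "__main__":
    beta, y0 = catenary_params(a=1.0, L=1.5)
    print(beta, y0, catenary_y(1.0, beta, y0), math.sinh(beta) / beta)
```

With $a = 1$ and $L = 1.5$, the printed height at $x = a$ should be zero up to rounding, and the lowest point $y_0$ is negative, as the picture suggests.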


Homogeneous equations, separable in new variables 


One can make a change of variable to convert a differential equation of the form 

(1.3.28)   $\frac{dx}{dt} = f(t,x)$

to a separable equation when $f(t,x)$ has the following homogeneity property: 

(1.3.29)   $f(rt, rx) = f(t,x), \quad \forall\, r \in \mathbb{R} \setminus 0.$

In such a case, $f$ has the form 

(1.3.30)   $f(t,x) = g\Bigl(\frac{x}{t}\Bigr).$

We can set 

(1.3.31)   $y = \frac{x}{t},$

so $x = ty$, $x' = ty' + y$, and (1.3.28) turns into 

(1.3.32)   $\frac{dy}{dt} = \frac{g(y) - y}{t},$
which is separable. 
For example, consider 

(1.3.33)   $\frac{dx}{dt} = \frac{x^2 - t^2}{x^2 + t^2} + \frac{x}{t}.$

In this case, (1.3.29) applies, and we can take $g(y) = (y^2 - 1)/(y^2 + 1) + y$ in (1.3.30), 
so with $y$ as in (1.3.31) we have 

(1.3.34)   $\frac{dy}{dt} = \frac{1}{t}\, \frac{y^2 - 1}{y^2 + 1},$

which separates to 

(1.3.35)   $\Bigl(1 + \frac{2}{y^2 - 1}\Bigr)\, dy = \frac{dt}{t}.$

To integrate the left side of (1.3.35), write 

(1.3.36)   $\frac{2}{y^2 - 1} = \frac{1}{y - 1} - \frac{1}{y + 1},$

to get 

(1.3.37)   $\int \frac{2}{y^2 - 1}\, dy = \log|y - 1| - \log|y + 1| = \log\Bigl|\frac{y - 1}{y + 1}\Bigr|,$

the latter identity by (1.1.28). Thus the solution to (1.3.33) is given implicitly by 

(1.3.38)   $\frac{x}{t} + \log\Bigl|\frac{x - t}{x + t}\Bigr| = \log|t| + C.$
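
An implicit solution formula of this kind can be tested numerically: integrate (1.3.33) with a standard scheme and check that the combination $x/t + \log|(x-t)/(x+t)| - \log|t|$ stays constant along the computed solution. The sketch below does this with a classical fourth order Runge-Kutta step; the names `f`, `invariant`, and `integrate`, and the starting point $(t,x) = (1,3)$, are illustrative assumptions.

```python
import math


def f(t, x):
    """Right side of (1.3.33)."""
    return (x * x - t * t) / (x * x + t * t) + x / t


def invariant(t, x):
    """x/t + log|(x - t)/(x + t)| - log|t|, which should equal the constant C."""
    return x / t + math.log(abs((x - t) / (x + t))) - math.log(abs(t))


def integrate(t0, x0, t1, n_steps):
    """Classical RK4 for dx/dt = f(t, x) from t0 to t1; returns x(t1)."""
    h = (t1 - t0) / n_steps
    t, x = t0, x0
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


if __name__ == "__main__":
    t0, x0, t1 = 1.0, 3.0, 2.0
    x1 = integrate(t0, x0, t1, 2000)
    print(invariant(t0, x0), invariant(t1, x1))
```

The two printed values should agree to many digits, which is a useful check on the signs in the partial fraction step.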


Exercises 


Solve the following initial value problems. Do the integrals, if you can. 


1. 
dx 
at 241, 2(0)=0 
oD: 
dx a 
He +1, 2«(0)=0 
3. Z 
dx a +l 
dt” t241’ at 
4, ‘ 
iL 
R= - vet x(0) =2 
5. j 
xv Px 
a * (0) =0 
6. 
dx at 
dt x24 42’ ss 


1.4. Second order equations—reducible cases 


Second order differential equations have the form 
(1.4.1)   $x'' = f(t,x,x'), \quad x(t_0) = x_0, \quad x'(t_0) = v_0.$

There are some important cases, with special structure, which reduce to first order 
equations for 

(1.4.2)   $v = \frac{dx}{dt}.$

One such case is 

(1.4.3)   $x'' = f(t,x'),$

which for $v$ given by (1.4.2) yields 

(1.4.4)   $\frac{dv}{dt} = f(t,v), \quad v(t_0) = v_0.$

Depending on the nature of $f(t,v)$, methods discussed in §§1.2-1.3 might apply to 
(1.4.4). Once one has $v(t)$, then 

(1.4.5)   $x(t) = x_0 + \int_{t_0}^t v(s)\, ds.$


The following is a more significant special case: 
1.4.6 x” = f(a,2’). 
Direct substitution of v, given by (1.4.2), yields 


1.4.7 — = f(z,v), 


which is not satisfactory, since (1.4.7) contains too many variables. One route to
success is to rewrite the equation as one for v as a function of x, using
(1.4.8)    $\dfrac{dv}{dt} = \dfrac{dv}{dx}\,\dfrac{dx}{dt} = v\,\dfrac{dv}{dx}.$
Substitution into (1.4.7) gives the first order equation
(1.4.9)    $\dfrac{dv}{dx} = \dfrac{f(x, v)}{v}, \quad v(x_0) = v_0.$
Again, depending on the nature of f(x, v)/v, methods developed in §§1.2-1.3 might
apply to (1.4.9).
An important special case of (1.4.6) is
(1.4.10)    $x'' = f(x),$
in which case (1.4.9) becomes
(1.4.11)    $\dfrac{dv}{dx} = \dfrac{f(x)}{v},$

which is separable,
(1.4.12)    $v\,dv = f(x)\,dx,$
hence
(1.4.13)    $\dfrac{1}{2}v^2 = g(x) + C, \qquad \displaystyle\int f(x)\,dx = g(x) + C.$
Thus
(1.4.14)    $\dfrac{dx}{dt} = v = \pm\sqrt{2g(x) + 2C},$


which in turn is separable,
(1.4.15)    $\pm\displaystyle\int \dfrac{dx}{\sqrt{2g(x) + 2C}} = t + C_2.$

The constants $C$ and $C_2$ are determined by the initial conditions.


Exercises


Use v = dx/dt to transform each of the following equations to first order equations,
either for v = v(t) or for v = v(x), as appropriate. Solve these first order equations,
if you can.


1. 
x _ dx 
dt2 ~~ dt 
2: 
dx 7 dx ay 
dt2 dt — 
3. 
ax _ dx 
dt?” dt 
4. 
x dx = 
—~ =—4+9. 
dt? dt 
5. 
CEs. 4 53 
dt2 


1.5. Newton’s equations for motion in one dimension 


Newton’s law for motion in one dimension (1D) of a particle of mass m, subject to 
a force F, is 


(1.5.1)    $F = ma,$
where a is acceleration,
(1.5.2)    $a(t) = \dfrac{dv}{dt} = \dfrac{d^2x}{dt^2},$
the rate of change of the velocity
$v(t) = dx/dt.$
In general one might have
$F = F(t, x, x').$


If F is t-independent, $F = F(x, x')$, which puts us in the setting of (1.4.6).
Frequently, one has F = F(x), which puts us in the setting of (1.4.10). We
revisit this setting, bringing in some more concepts from physics. We set
(1.5.3)    $F(x) = -V'(x).$
V(x), defined up to an additive constant, is called the potential energy. The total
energy is the sum of the potential energy and the kinetic energy, $mv^2/2$:
(1.5.4)    $E = \dfrac{1}{2}mv(t)^2 + V(x(t)).$
Note that
(1.5.5)    $\dfrac{dE}{dt} = mv(t)v'(t) + V'(x(t))x'(t) = ma(t)v(t) - F(x(t))v(t) = 0,$
the last identity by (1.5.1). This identity celebrates energy conservation. Given
that x solves
(1.5.6)    $m\dfrac{d^2x}{dt^2} = -V'(x), \quad x(t_0) = x_0, \quad x'(t_0) = v_0,$
one has from (1.5.5) that for all t,
(1.5.7)    $\dfrac{1}{2}mx'(t)^2 + V(x(t)) = E_0,$
where
(1.5.8)    $E_0 = \dfrac{1}{2}mv_0^2 + V(x_0).$


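The equation (1.5.7) is equivalent to
(1.5.9)    $\dfrac{dx}{dt} = \pm\sqrt{\dfrac{2}{m}\bigl(E_0 - V(x)\bigr)},$
which separates to
(1.5.10)    $\displaystyle\int \dfrac{dx}{\sqrt{E_0 - V(x)}} = \pm\sqrt{\dfrac{2}{m}}\,t + C,$
or, alternatively,
(1.5.11)    $\displaystyle\int_{x_0}^{x(t)} \dfrac{dy}{\sqrt{E_0 - V(y)}} = \pm\sqrt{\dfrac{2}{m}}\,(t - t_0).$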


Note that (1.5.7) and (1.5.10) recover (1.4.13) and (1.4.15). 
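
The conservation law (1.5.7) is easy to observe numerically. The sketch below is an illustration, not part of the original text: it integrates $mx'' = -V'(x)$ with the classical fourth order Runge-Kutta method and records the largest deviation of the energy (1.5.4) from its initial value. The function names and the sample double-well potential $V(x) = x^4/4 - x^2/2$ are assumptions chosen for the demonstration.

```python
def rk4_step(force, m, x, v, dt):
    """One classical Runge-Kutta step for the system x' = v, v' = force(x)/m."""
    def acc(x_):
        return force(x_) / m
    k1x, k1v = v, acc(x)
    k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x)
    k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x)
    k4x, k4v = v + dt * k3v, acc(x + dt * k3x)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x_new, v_new

def energy(V, m, x, v):
    """Total energy (1.5.4): kinetic plus potential."""
    return 0.5 * m * v * v + V(x)

def max_energy_drift(V, force, m, x0, v0, dt=1e-3, steps=10000):
    """Largest |E(t) - E_0| seen along the numerically computed path."""
    e0 = energy(V, m, x0, v0)
    x, v, drift = x0, v0, 0.0
    for _ in range(steps):
        x, v = rk4_step(force, m, x, v, dt)
        drift = max(drift, abs(energy(V, m, x, v) - e0))
    return drift

# Illustrative (assumed) potential and the force F = -V' that goes with it.
def V_demo(x):
    return x ** 4 / 4 - x ** 2 / 2

def F_demo(x):
    return x - x ** 3

if __name__ == "__main__":
    print(max_energy_drift(V_demo, F_demo, m=1.0, x0=0.5, v0=0.0))
```

The drift is not exactly zero, since the integrator only approximates the solution, but with a small step it should be many orders of magnitude smaller than the energy itself.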
Projectile problem 


Let's look in more detail at a special case, modeling the motion of a projectile
of mass m traveling directly away from (or toward) the Earth. In such a case,
Newton's law of gravity gives
(1.5.12)    $F(x) = -\dfrac{Km}{x^2}, \quad \text{hence} \quad V(x) = -\dfrac{Km}{x}, \quad x \in (0,\infty).$
In such a case, the conserved energy is
(1.5.13)    $E_0 = m\left(\dfrac{v^2}{2} - \dfrac{K}{x}\right) = mE(x, v).$
See Figure 1.5.1 for a sketch of level curves of the function E(x,v). There are three
cases to consider:
(1.5.14)    $E = -\dfrac{a^2}{2} < 0, \quad E = 0, \quad E = \dfrac{a^2}{2} > 0,$ i.e., $E_0 = -\dfrac{ma^2}{2} < 0, \quad E_0 = 0, \quad E_0 = \dfrac{ma^2}{2} > 0.$

Figure 1.5.1. Projectile paths


In the first case, x(t) has a maximum at $x_{\max} = 2K/a^2$. In the other two cases,
$x(t) \to +\infty$ as $t \to +\infty$ (if $v_0 > 0$) or as $t \to -\infty$ (if $v_0 < 0$). Given $x_0 \in (0,\infty)$,
the velocity $v_0 \in (0,\infty)$ for which $E(x_0, v_0) = 0$ is called the escape velocity.
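
To make the three cases concrete, here is a small Python sketch (not from the original text; the function names are illustrative). It computes the escape velocity from $E(x_0, v_0) = v_0^2/2 - K/x_0 = 0$ and the maximal distance $x_{\max}$ in the case $E < 0$. The Earth values used in the demonstration, $K \approx 3.986 \times 10^{14}$ m³/s² and $x_0 \approx 6.371 \times 10^6$ m, are standard approximate figures supplied here as assumptions; the text itself does not specify units or values for K.

```python
import math

def escape_velocity(K, x0):
    """Speed v0 > 0 at which E(x0, v0) = v0^2/2 - K/x0 vanishes."""
    return math.sqrt(2.0 * K / x0)

def max_distance(K, x0, v0):
    """Return x_max when E(x0, v0) < 0; return math.inf when E >= 0."""
    E = 0.5 * v0 * v0 - K / x0
    if E >= 0:
        return math.inf
    # Writing E = -a^2/2, the turning point v = 0 gives x_max = 2K/a^2 = -K/E.
    return -K / E

if __name__ == "__main__":
    K_EARTH = 3.986e14   # assumed: approximate G * (mass of Earth), m^3/s^2
    R_EARTH = 6.371e6    # assumed: approximate mean radius of Earth, m
    print(escape_velocity(K_EARTH, R_EARTH))   # on the order of 1.1e4 m/s
    print(max_distance(1.0, 1.0, 1.0))         # E = -1/2, so a^2 = 1
```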

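We investigate the integral on the left side of (1.5.10), i.e.,
(1.5.15)    $\displaystyle\int \dfrac{dx}{\sqrt{E_0 + Km/x}},$
which in the three cases in (1.5.14) is $\sqrt{2/m}$ times
(1.5.16)    $\displaystyle\int \dfrac{x\,dx}{\sqrt{2Kx - a^2x^2}}, \qquad \int \sqrt{\dfrac{x}{2K}}\,dx, \qquad \int \dfrac{x\,dx}{\sqrt{2Kx + a^2x^2}},$
respectively. The second integral in (1.5.16) is easy; we investigate how to compute
the other two, which we rewrite as
(1.5.17)    $\dfrac{1}{a}\displaystyle\int \dfrac{x\,dx}{\sqrt{2kx - x^2}}, \qquad \dfrac{1}{a}\int \dfrac{x\,dx}{\sqrt{2kx + x^2}}, \qquad k = \dfrac{K}{a^2}.$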


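We can compute these integrals by completing the square:
(1.5.18)    $x^2 - 2kx = (x-k)^2 - k^2, \qquad x^2 + 2kx = (x+k)^2 - k^2.$
The respective changes of variables y = x - k and y = x + k turn the integrals in
(1.5.17) into the respective integrals,
(1.5.19)    $\displaystyle\int \dfrac{(y+k)\,dy}{\sqrt{k^2 - y^2}}, \qquad \int \dfrac{(y-k)\,dy}{\sqrt{y^2 - k^2}}.$
By inspection,
(1.5.20)    $\displaystyle\int \dfrac{y\,dy}{\sqrt{k^2 - y^2}} = -\sqrt{k^2 - y^2} + C, \qquad \int \dfrac{y\,dy}{\sqrt{y^2 - k^2}} = \sqrt{y^2 - k^2} + C.$
The remaining parts of (1.5.19), after a change of variable y = kz, become
(1.5.21)    $k\displaystyle\int \dfrac{dz}{\sqrt{1 - z^2}}, \qquad -k\int \dfrac{dz}{\sqrt{z^2 - 1}}.$
To do these integrals, use
(1.5.22)    $z = \sin s \Longrightarrow \displaystyle\int \dfrac{dz}{\sqrt{1 - z^2}} = \int \dfrac{\cos s}{\cos s}\,ds = s + C,$
            $z = \cosh s \Longrightarrow \displaystyle\int \dfrac{dz}{\sqrt{z^2 - 1}} = \int \dfrac{\sinh s}{\sinh s}\,ds = s + C.$


Exercises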


1. Make calculations analogous to (1.5.12)—(1.5.15) for each of the following forces. 
Examine whether you can do the resulting integrals. 


a) 

F(a) =—-Kex 
b) 

F(x) =—K2? 
c) 4 

F(x) = ore, 
d) 

F(x) =a2-2°. 


2. For such forces as given above, in each case find a potential energy V(x) and 
sketch the level curves in the (x, v)-plane of the energy function 


E(«,v) = se +V(za). 
3. Use the substitution 


to evaluate 


and use 
to evaluate 
/ dx 
[ke 
eet 
Use these calculations as alternatives for evaluating (1.5.15), for $E_0 < 0$ and $E_0 > 0$,
respectively.


Figure 1.6.1. Pendulum


1.6. The pendulum 


We produce a differential equation to describe the motion of a pendulum, which
will be modeled by a rigid rod, of length $\ell$, suspended at one end. We assume
the rod has negligible mass, except for an object of mass m at the other end, as
illustrated in Figure 1.6.1. The rod is held at an angle $\theta = \theta_0$ from the downward
pointing vertical, and released at time t = 0, after which it moves because of the
force of gravity. We seek a differential equation for $\theta$ as a function of t.


The end with the mass m traces out a path in a plane, which we identify with 
the complex plane, with the origin at the point where the pendulum is suspended, 
and the real axis pointing vertically down. We can write the path as 


(1.6.1)    $z(t) = \ell e^{i\theta(t)}.$

The velocity is

(1.6.2)    $v(t) = z'(t) = i\ell\theta'(t)e^{i\theta(t)},$

and the acceleration is

(1.6.3)    $a(t) = v'(t) = \ell\bigl[i\theta''(t) - \theta'(t)^2\bigr]e^{i\theta(t)}.$


The force of gravity on the mass is mg, where $g = 32$ ft/sec$^2$, provided the pendulum
is located on the surface of the Earth. The total force F on the mass is the sum of
the gravitational force and the force the rod exerts on the mass to keep it always
at a distance $\ell$ from the origin. The force the rod exerts is parallel to $e^{i\theta(t)}$, so
(1.6.4)    $F(t) = mg + \Phi(t)e^{i\theta(t)},$
for some real valued $\Phi(t)$ (to be determined). We can rewrite mg as
(1.6.5)    $mg = mg\,e^{-i\theta(t)}e^{i\theta(t)} = mg\bigl[\cos\theta(t) - i\sin\theta(t)\bigr]e^{i\theta(t)},$
and hence
(1.6.6)    $F(t) = \bigl[-img\sin\theta(t) + mg\cos\theta(t) + \Phi(t)\bigr]e^{i\theta(t)}.$
Newton's law F = ma applied to (1.6.3)-(1.6.6) gives
(1.6.7)    $m\ell\bigl[i\theta''(t) - \theta'(t)^2\bigr] = -img\sin\theta(t) + \bigl(mg\cos\theta(t) + \Phi(t)\bigr).$
Comparing imaginary parts gives
(1.6.8)    $m\ell\theta''(t) = -mg\sin\theta(t),$
or
(1.6.9)    $\dfrac{d^2\theta}{dt^2} + \dfrac{g}{\ell}\sin\theta = 0.$

This is the pendulum equation.
The kinetic energy of this pendulum is
(1.6.10)    $\dfrac{1}{2}m|v(t)|^2 = \dfrac{m\ell^2}{2}\theta'(t)^2,$
and its potential energy (up to an additive constant) is given by $-mg$ times the
real part of z(t), i.e.,
(1.6.11)    $V(\theta) = -mg\ell\cos\theta.$
The total energy is hence
(1.6.12)    $E = \dfrac{m\ell^2}{2}\theta'(t)^2 - mg\ell\cos\theta(t).$
Note that
(1.6.13)    $\dfrac{dE}{dt} = m\ell^2\theta'(t)\theta''(t) + mg\ell\bigl(\sin\theta(t)\bigr)\theta'(t) = m\ell^2\theta'(t)\Bigl(\theta''(t) + \dfrac{g}{\ell}\sin\theta(t)\Bigr),$
so the pendulum equation (1.6.9) implies dE/dt = 0, i.e., we have conservation of
energy. Under the initial condition formulated at the beginning of this section,
(1.6.14)    $\theta(0) = \theta_0, \quad \theta'(0) = 0,$
we have initial energy
(1.6.15)    $E_0 = -mg\ell\cos\theta_0,$
and the energy conservation gives
(1.6.16)    $\mathcal{E}(\theta, \theta') = \dfrac{2E_0}{m\ell^2} = A_0,$
where
(1.6.17)    $\mathcal{E}(\theta, \psi) = \psi^2 - \dfrac{2g}{\ell}\cos\theta.$

Level curves of this function are depicted in Figure 1.6.2. If $\theta(t)$ solves (1.6.9) and
$\psi(t) = \theta'(t)$, then $(\theta(t), \psi(t))$ traces out a path on one of these level curves.


Note that
(1.6.18)    $\nabla\mathcal{E}(\theta, \psi) = \Bigl(\dfrac{2g}{\ell}\sin\theta,\ 2\psi\Bigr),$
so $\mathcal{E}$ has critical points at $\theta = k\pi$, $\psi = 0$. The matrix of second order partial
derivatives of $\mathcal{E}$ is
(1.6.19)    $D^2\mathcal{E}(\theta, \psi) = \begin{pmatrix} \frac{2g}{\ell}\cos\theta & 0 \\ 0 & 2 \end{pmatrix},$
so
(1.6.20)    $D^2\mathcal{E}(k\pi, 0) = \begin{pmatrix} (-1)^k\frac{2g}{\ell} & 0 \\ 0 & 2 \end{pmatrix}.$

We see that at the critical point $(k\pi, 0)$, $\mathcal{E}$ has a local minimum if k is even and a
saddle-type behavior if k is odd, as illustrated in Figure 1.6.2.


Figure 1.6.2. Level curves of $\mathcal{E}(\theta, \psi) = \psi^2 - (2g/\ell)\cos\theta$


Note that if the initial condition (1.6.14) holds, then $A_0 = -(2g/\ell)\cos\theta_0$, and
hence $A_0 < 2g/\ell$, so the curve traced by $(\theta(t), \psi(t))$ is a closed curve. One might
instead have initial data of the form
(1.6.21)    $\theta(0) = \theta_0, \quad \theta'(0) = \psi_0,$
and one could pick $\psi_0$ so that $\mathcal{E}(\theta_0, \psi_0) > 2g/\ell$.


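We proceed to formulas parallel to (1.5.7)-(1.5.11). Starting from the energy
conservation (1.6.16), which we rewrite as
(1.6.22)    $\theta'(t)^2 - \dfrac{2g}{\ell}\cos\theta(t) = A_0,$
we have
(1.6.23)    $\dfrac{d\theta}{dt} = \pm\sqrt{\dfrac{2g}{\ell}}\,\sqrt{A_1 + \cos\theta}, \qquad A_1 = \dfrac{\ell}{2g}A_0 = \dfrac{E_0}{mg\ell},$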


which separates and integrates to
(1.6.24)    $\displaystyle\int \dfrac{d\theta}{\sqrt{A_1 + \cos\theta}} = \pm\sqrt{\dfrac{2g}{\ell}}\,t + C.$
In the current set-up, where, by (1.6.12), $E_0 \ge -mg\ell$, we have
(1.6.25)    $A_1 \ge -1.$

Note that to achieve $A_1 = -1$ requires $\theta(0) = 0$ and $\theta'(0) = 0$, in which case
(1.6.23) yields the initial value problem
(1.6.26)    $\theta'(t) = \pm\sqrt{\dfrac{2g}{\ell}}\,\sqrt{-1 + \cos\theta}, \quad \theta(0) = 0,$
with solution
(1.6.27)    $\theta(t) \equiv 0.$

In this case (1.6.24) has no meaning. Indeed, if $\theta > 0$ and one considers
(1.6.28)    $\displaystyle\int_0^{\theta} \dfrac{d\varphi}{\sqrt{-1 + \cos\varphi}},$
the integrand is imaginary and furthermore it is not integrable. Nevertheless, $\theta(t) \equiv 0$
is a solution to the original problem.


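Let us now assume $A_1 > -1$. Write
(1.6.29)    $B_1 = A_1 + 1 > 0,$
so
(1.6.30)    $A_1 + \cos\theta = B_1 - (1 - \cos\theta) = B_1 - 2\sin^2\dfrac{\theta}{2},$
thanks to the identity $\cos 2\varphi = \cos^2\varphi - \sin^2\varphi = 1 - 2\sin^2\varphi$. We can rewrite the
left side of (1.6.24) as
(1.6.31)    $\displaystyle\int \dfrac{d\theta}{\sqrt{A_1 + \cos\theta}} = \int \dfrac{d\theta}{\sqrt{B_1 - 2\sin^2\theta/2}} = \dfrac{\beta}{\sqrt{2}}\int \dfrac{d\theta}{\sqrt{1 - \beta^2\sin^2\theta/2}},$
with
(1.6.32)    $\beta = \sqrt{\dfrac{2}{B_1}} > 0.$

The last integral in (1.6.31) is known as an elliptic integral when $\beta^2 \ne 1$, i.e., when
$A_1 \ne 1$. Material on such integrals can be found in books that treat elliptic function
theory, including [47].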

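The case $\beta = 1$ (i.e., $A_1 = 1$, or $E_0 = mg\ell$) does give rise to an elementary
integral, namely
(1.6.33)    $\displaystyle\int \dfrac{d\theta}{\sqrt{1 + \cos\theta}} = \dfrac{1}{\sqrt{2}}\int \sec\dfrac{\theta}{2}\,d\theta = \sqrt{2}\,\sinh^{-1}\Bigl(\tan\dfrac{\theta}{2}\Bigr) + C,$
for $|\theta| < \pi$, the latter identity by Exercise 14 of §1.1.


Further study of the elliptic integral in (1.6.24)


Let us pursue the computations arising from (1.6.24) in more detail, taking the
initial condition
(1.6.34)    $\theta(0) = 0, \quad \theta'(0) = \psi_0, \quad \psi_0 \in (0, \infty).$
Then (1.6.24) yields, for the solution $\theta(t)$,
(1.6.35)    $\displaystyle\int_0^{\theta(t)} \dfrac{d\varphi}{\sqrt{A_1 + \cos\varphi}} = \sqrt{\dfrac{2g}{\ell}}\,t.$
In such a case,
(1.6.36)    $A_1 = \dfrac{E_0}{mg\ell} = \dfrac{\ell}{2g}\psi_0^2 - 1, \quad \text{hence} \quad B_1 = \dfrac{\ell}{2g}\psi_0^2.$
Then (1.6.31) yields
(1.6.37)    $\displaystyle\int_0^{\theta(t)} \dfrac{d\varphi}{\sqrt{1 - \beta^2\sin^2\varphi/2}} = \dfrac{\sqrt{2}}{\beta}\sqrt{\dfrac{2g}{\ell}}\,t = \psi_0 t,$
with
(1.6.38)    $\beta = \sqrt{\dfrac{2}{B_1}} = \dfrac{2}{\psi_0}\sqrt{\dfrac{g}{\ell}}.$
Let us specialize (1.6.37) to
(1.6.39)    $\beta = 1, \quad \text{hence} \quad \psi_0 = 2\sqrt{\dfrac{g}{\ell}}, \quad \text{so} \quad E_0 = mg\ell.$
By (1.6.33), we get
(1.6.40)    $\tan\dfrac{\theta(t)}{2} = \sinh\dfrac{\psi_0 t}{2} = \sinh\sqrt{\dfrac{g}{\ell}}\,t,$
or
(1.6.41)    $\theta(t) = 2\tan^{-1}\sinh\Bigl(\sqrt{\dfrac{g}{\ell}}\,t\Bigr).$
Applying d/dt yields
(1.6.42)    $\theta'(t) = \psi(t) = 2\sqrt{\dfrac{g}{\ell}}\,\dfrac{1}{\cosh\sqrt{g/\ell}\,t}.$
In this case,
(1.6.43)    $\theta(t) \to \pm\pi$ and $\psi(t) \to 0$, as $t \to \pm\infty$.


The curve $(\theta(t), \psi(t))$ and its mirror image are called separatrices. These curves
separate bounded periodic solutions from unbounded solution curves.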
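
The closed-form separatrix (1.6.41)-(1.6.42) can be checked numerically. The Python sketch below is an illustration added here, with hypothetical function names and an assumed rod length of 2 ft (the text fixes only g = 32 ft/sec²). It evaluates the residual of the pendulum equation (1.6.9) by a centered second difference and checks that the path lies on the level set $\mathcal{E} = 2g/\ell$.

```python
import math

def theta_sep(t, g, ell):
    """Separatrix angle from (1.6.41): 2 * arctan(sinh(sqrt(g/ell) * t))."""
    return 2.0 * math.atan(math.sinh(math.sqrt(g / ell) * t))

def psi_sep(t, g, ell):
    """Angular velocity from (1.6.42): 2 * sqrt(g/ell) / cosh(sqrt(g/ell) * t)."""
    k = math.sqrt(g / ell)
    return 2.0 * k / math.cosh(k * t)

def pendulum_residual(t, g, ell, h=1e-4):
    """theta'' + (g/ell) sin(theta), with theta'' from a centered difference."""
    th_plus = theta_sep(t + h, g, ell)
    th_mid = theta_sep(t, g, ell)
    th_minus = theta_sep(t - h, g, ell)
    second = (th_plus - 2.0 * th_mid + th_minus) / (h * h)
    return second + (g / ell) * math.sin(th_mid)

def level_value(t, g, ell):
    """The conserved quantity (1.6.17) evaluated along the separatrix."""
    return psi_sep(t, g, ell) ** 2 - (2.0 * g / ell) * math.cos(theta_sep(t, g, ell))

if __name__ == "__main__":
    g, ell = 32.0, 2.0   # ell = 2 ft is an assumed value for the demonstration
    for t in (-0.5, 0.0, 0.3, 1.0):
        print(t, pendulum_residual(t, g, ell), level_value(t, g, ell))
```

With these values the level should be $2g/\ell = 32$ at every sampled time, and the residual should be small relative to $g/\ell = 16$.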


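We turn to the case
(1.6.44)    $\beta > 1, \quad \text{hence} \quad 0 < \psi_0 < 2\sqrt{\dfrac{g}{\ell}}, \quad \text{so} \quad E_0 < mg\ell.$
In this case,
(1.6.45)    $\theta(t)$ is periodic, say of period $\Pi(\psi_0)$,
and we want to find a formula for $\Pi(\psi_0)$. Looking at Figure 1.6.2, we see that
(1.6.46)    $\psi(t) = 0$ at $t = \dfrac{1}{4}\Pi(\psi_0).$
Comparison with the formula
(1.6.47)    $\dfrac{\ell}{2g}\psi^2 = B_1 - 2\sin^2\dfrac{\theta}{2} = \dfrac{2}{\beta^2}\Bigl(1 - \beta^2\sin^2\dfrac{\theta}{2}\Bigr)$
gives
(1.6.48)    $\psi = 0$ when $\sin^2\dfrac{\theta}{2} = \dfrac{B_1}{2},$
and hence
(1.6.49)    $\dfrac{1}{4}\Pi(\psi_0) = \sqrt{\dfrac{\ell}{2g}}\displaystyle\int_0^{\theta_1} \dfrac{d\theta}{\sqrt{B_1 - 2\sin^2\theta/2}}, \qquad \sin\dfrac{\theta_1}{2} = \sqrt{\dfrac{B_1}{2}} = \dfrac{1}{2}\sqrt{\dfrac{\ell}{g}}\,\psi_0.$
Equivalently,
(1.6.50)    $\dfrac{1}{4}\Pi(\psi_0) = \dfrac{1}{\psi_0}\displaystyle\int_0^{\theta_1} \dfrac{d\theta}{\sqrt{1 - \beta^2\sin^2\theta/2}} = \dfrac{2}{\psi_0}\int_0^{\theta_1/2} \dfrac{d\varphi}{\sqrt{1 - \beta^2\sin^2\varphi}},$
with $\theta_1$ as in (1.6.49). Making the change of variable $x = \sin\varphi$, we get
(1.6.51)    $\dfrac{1}{4}\Pi(\psi_0) = \dfrac{2}{\psi_0}\displaystyle\int_0^{1/\beta} \dfrac{dx}{\sqrt{(1 - x^2)(1 - \beta^2x^2)}},$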


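and finally, setting $y = \beta x$ yields
(1.6.52)    $\dfrac{1}{4}\Pi(\psi_0) = \dfrac{2}{\beta\psi_0}\displaystyle\int_0^1 \dfrac{dy}{\sqrt{(1 - y^2)(1 - \alpha^2y^2)}} = \sqrt{\dfrac{\ell}{g}}\int_0^1 \dfrac{dy}{\sqrt{(1 - y^2)(1 - \alpha^2y^2)}},$
with
(1.6.53)    $\alpha = \dfrac{1}{\beta} = \sqrt{\dfrac{B_1}{2}} = \dfrac{1}{2}\sqrt{\dfrac{\ell}{g}}\,\psi_0,$
so $0 < \alpha < 1$. Clearly, $\alpha \to 0$ when $\psi_0 \to 0$, so we have
(1.6.54)    $\displaystyle\lim_{\psi_0 \to 0}\Pi(\psi_0) = 4\sqrt{\dfrac{\ell}{g}}\int_0^1 \dfrac{dy}{\sqrt{1 - y^2}} = 2\pi\sqrt{\dfrac{\ell}{g}}.$

This coincides with the period of solutions to
(1.6.55)    $\dfrac{d^2\theta}{dt^2} + \dfrac{g}{\ell}\theta = 0,$
which we will identify in §1.8 with the linearization of the pendulum equation about
the zero solution.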


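Finally, we examine the case
(1.6.56)    $0 < \beta < 1, \quad \text{hence} \quad \psi_0 = \dfrac{2}{\beta}\sqrt{\dfrac{g}{\ell}} > 2\sqrt{\dfrac{g}{\ell}}, \quad \text{so} \quad E_0 > mg\ell.$

In such a case, we see from (1.6.47) that $\theta(t)$ is monotone in t. However, it does
possess the “periodicity”
(1.6.57)    $\theta(t + s) = \theta(t) + 2\pi$, with $s = \Pi(\psi_0)$,
where, when (1.6.56) holds,
(1.6.58)    $\dfrac{1}{2}\Pi(\psi_0) = \sqrt{\dfrac{\ell}{2g}}\displaystyle\int_0^{\pi} \dfrac{d\theta}{\sqrt{A_1 + \cos\theta}} = \dfrac{1}{\psi_0}\int_0^{\pi} \dfrac{d\theta}{\sqrt{1 - \beta^2\sin^2\theta/2}} = \dfrac{2}{\psi_0}\int_0^{\pi/2} \dfrac{d\varphi}{\sqrt{1 - \beta^2\sin^2\varphi}}.$
Making the change of variable $x = \sin\varphi$, we get
(1.6.59)    $\dfrac{1}{2}\Pi(\psi_0) = \dfrac{2}{\psi_0}\displaystyle\int_0^1 \dfrac{dx}{\sqrt{(1 - x^2)(1 - \beta^2x^2)}}.$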


REMARK. The integrals in (1.6.52) and (1.6.59) are called complete elliptic integrals.
One can expand these integrals in convergent power series in $\alpha^2$ and $\beta^2$, respectively,
using the formula
(1.6.60)    $\dfrac{1}{\sqrt{1 - u}} = \displaystyle\sum_{k=0}^{\infty} a_k u^k$, for $|u| < 1$, with $a_0 = 1, \quad a_k = \dfrac{1}{k!}\cdot\dfrac{1}{2}\cdot\dfrac{3}{2}\cdots\Bigl(k - \dfrac{1}{2}\Bigr)$
(see Appendix 1.C), with $u = \alpha^2y^2$ in (1.6.52) and $u = \beta^2x^2$ in (1.6.59), and then
integrating term by term. The coefficients in the resulting power series involve
(1.6.61)    $\displaystyle\int_0^1 \dfrac{x^{2k}}{\sqrt{1 - x^2}}\,dx = \int_0^{\pi/2} \sin^{2k}\varphi\,d\varphi = \dfrac{1}{4}\Bigl(\dfrac{1}{2i}\Bigr)^{2k}\int_0^{2\pi} \bigl(e^{i\varphi} - e^{-i\varphi}\bigr)^{2k}\,d\varphi = \pi\,2^{-2k-1}\binom{2k}{k}.$

One can also express these complete elliptic integrals in terms of a function known
as the Gauss arithmetic-geometric mean (cf. [47], Chapter 6, §4).
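
To illustrate the last remark, the following Python sketch (an addition, not taken from the text or from [47]) evaluates the complete elliptic integral in (1.6.52) through the classical identity $\int_0^1 dy/\sqrt{(1-y^2)(1-\alpha^2y^2)} = \pi/\bigl(2\,\mathrm{AGM}(1, \sqrt{1-\alpha^2})\bigr)$ and uses it to compute the period $\Pi(\psi_0)$ in the oscillating case (1.6.44). The function names and the sample values of $\ell$ and $\psi_0$ are assumptions made for the demonstration.

```python
import math

def agm(a, b, max_iter=60):
    """Gauss arithmetic-geometric mean of two positive numbers."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * max(a, b):
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return 0.5 * (a + b)

def complete_elliptic_K(alpha):
    """Integral of dy / sqrt((1 - y^2)(1 - alpha^2 y^2)) over [0, 1], 0 <= alpha < 1."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - alpha * alpha)))

def pendulum_period(psi0, g, ell):
    """Period from (1.6.52)-(1.6.53); requires 0 <= psi0 < 2 sqrt(g/ell)."""
    alpha = 0.5 * psi0 * math.sqrt(ell / g)
    if not 0.0 <= alpha < 1.0:
        raise ValueError("need 0 <= psi0 < 2*sqrt(g/ell) for oscillating motion")
    return 4.0 * math.sqrt(ell / g) * complete_elliptic_K(alpha)

if __name__ == "__main__":
    g, ell = 32.0, 2.0                                   # ell is an assumed value
    print(2.0 * math.pi * math.sqrt(ell / g))            # limiting period (1.6.54)
    for psi0 in (0.01, 1.0, 4.0, 7.9):
        print(psi0, pendulum_period(psi0, g, ell))
```

The computed period should start near $2\pi\sqrt{\ell/g}$ for small $\psi_0$ and grow without bound as $\psi_0$ approaches the separatrix value $2\sqrt{g/\ell}$.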


Exercises 


1. Let E be given by (1.6.12). Show that if $\theta(t)$ solves (1.6.9) and $|\theta(t)| < \pi/2$ for all t,
then E < 0.


2. Show that the level set in Figure 1.6.2 where $\mathcal{E} = 2g/\ell$ (i.e., $E = mg\ell$) is given
by
$\psi = \pm 2\sqrt{\dfrac{g}{\ell}}\,\cos\dfrac{\theta}{2}.$


3. By (1.6.3), the component of acceleration parallel to $e^{i\theta}$ is $-\ell\theta'(t)^2e^{i\theta}$. Compute
the component of the gravitational force parallel to $e^{i\theta}$, and deduce that the
force the rod exerts on the mass to keep it always at a distance $\ell$ from the origin is
$\Phi e^{i\theta}$, with
$\Phi = -m\ell\theta'(t)^2 - mg\cos\theta.$
Deduce that, with E as in (1.6.12),
$\Phi(t) = -\dfrac{2E}{\ell} - 3mg\cos\theta(t).$


4. Apply the change of variable $s = \sin\varphi$ to the last integral in (1.6.31), i.e., to
$\displaystyle\int \dfrac{d\varphi}{\sqrt{1 - \beta^2\sin^2\varphi}}.$
Show that the integral becomes
$\displaystyle\int \dfrac{ds}{\sqrt{(1 - s^2)(1 - \beta^2s^2)}}.$
Specialize to $\beta = 1$ and obtain an alternative derivation of the formula for
$\displaystyle\int \sec\varphi\,d\varphi,$
given in Exercise 13 of §1.1.


5. Suppose the mass at the end of the pendulum has a charge $q_1$ and there is a
charge $q_2$ fixed at $(x, y) = (2\ell, 0)$. Then the force F(t) is modified to
$F(t) = mg - Kq_1q_2\,\dfrac{2\ell - \ell e^{i\theta(t)}}{|2\ell - \ell e^{i\theta(t)}|^3} + \Phi(t)e^{i\theta(t)},$
where K is a positive constant. Use this to produce a modification of the pendulum
equation.


1.7. Motion with resistance 


In many real cases, the force acting on a moving object is the sum of a force
associated with a potential and a resistance, typically depending on the velocity
and acting to slow the motion down. For example, the motion of a ball of mass
m falling through the air near the surface of the Earth can be modeled by the
differential equation
(1.7.1)    $m\dfrac{d^2x}{dt^2} = mg - \alpha\dfrac{dx}{dt},$
where the x-axis points down toward the Earth. Here $g = 32$ ft/sec$^2$ and $\alpha$ is an
experimentally determined constant, depending on the size of the ball, and measures
air resistance. We can rewrite (1.7.1) as an equation for v = dx/dt,
(1.7.2)    $\dfrac{dv}{dt} = g - \dfrac{\alpha}{m}v,$
an equation that is both linear and separable. Unless v is small, the formula $-\alpha v$
for the force of air resistance is not so accurate, and a more accurate equation might
be
(1.7.3)    $\dfrac{dv}{dt} = g - \dfrac{\alpha}{m}v - \dfrac{\beta}{m}v^3.$

This is not linear, but it is separable. For v close to the speed of sound in air, even
this model loses validity.
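
For the linear model (1.7.2), separation of variables gives $v(t) = v_\infty + (v_0 - v_\infty)e^{-\alpha t/m}$ with $v_\infty = mg/\alpha$. The Python sketch below is an added illustration (hypothetical function names; the values of m and $\alpha$ are made up for the demonstration) that compares this closed form with a plain Euler integration of (1.7.2).

```python
import math

def v_linear_drag(t, m, g, alpha, v0=0.0):
    """Closed-form solution of dv/dt = g - (alpha/m) v with v(0) = v0."""
    v_inf = m * g / alpha                      # limiting (terminal) velocity
    return v_inf + (v0 - v_inf) * math.exp(-alpha * t / m)

def euler_linear_drag(t_end, m, g, alpha, v0=0.0, steps=100000):
    """Forward Euler approximation of the same initial value problem."""
    dt = t_end / steps
    v = v0
    for _ in range(steps):
        v += dt * (g - (alpha / m) * v)
    return v

if __name__ == "__main__":
    m, g, alpha = 1.0, 32.0, 0.5               # m and alpha are assumed values
    for t in (1.0, 5.0, 50.0):
        print(t, v_linear_drag(t, m, g, alpha), euler_linear_drag(t, m, g, alpha))
```

With these sample values $mg/\alpha = 64$, and both columns should approach that number as t grows.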


If the ball is falling from the stratosphere toward the surface of the Earth, the
variation in air density, hence in air resistance, must be taken into account. One
might replace the model (1.7.1) by
(1.7.4)    $m\dfrac{d^2x}{dt^2} = mg - \alpha(x)\dfrac{dx}{dt}.$

The method of (1.4.6)-(1.4.9) is applicable here, yielding for v = dx/dt the equation
(1.7.5)    $\dfrac{dv}{dx} = \dfrac{g}{v} - \dfrac{\alpha(x)}{m}.$
This, however, is not typically amenable to a solution in terms of elementary functions.


Another example of motion with resistance arises in the pendulum. Between
air resistance and friction where the rod is attached, the pendulum equation (1.6.9)
might be modified to the following damped pendulum equation:
(1.7.6)    $\dfrac{d^2\theta}{dt^2} + \dfrac{\alpha}{m}\dfrac{d\theta}{dt} + \dfrac{g}{\ell}\sin\theta = 0,$
for some positive constant $\alpha$. Again the method of (1.4.6)-(1.4.9) is applicable,
and it yields for $\psi = d\theta/dt$ the equation
(1.7.7)    $\dfrac{d\psi}{d\theta} = -\dfrac{\alpha}{m} - \dfrac{g}{\ell}\,\dfrac{\sin\theta}{\psi}.$
However, this equation is not particularly tractable, and does not yield much insight
into the behavior of solutions to (1.7.6).
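
Even without a formula, (1.7.6) is easy to explore numerically. The Python sketch below is an added illustration: it integrates (1.7.6) by the classical Runge-Kutta method and evaluates the quantity $\mathcal{E}(\theta, \psi) = \psi^2 - (2g/\ell)\cos\theta$ from (1.6.17) along the path. For exact solutions one computes $d\mathcal{E}/dt = -2(\alpha/m)\psi^2 \le 0$, so the values should decrease. The function names, the parameter `damping` (standing for $\alpha/m$), and the sample values are assumptions.

```python
import math

def damped_pendulum_path(theta0, psi0, g, ell, damping, dt=1e-3, steps=20000):
    """RK4 path for theta'' + damping * theta' + (g/ell) sin(theta) = 0.

    `damping` plays the role of alpha/m in (1.7.6)."""
    def rhs(th, ps):
        return ps, -damping * ps - (g / ell) * math.sin(th)

    th, ps = theta0, psi0
    path = [(th, ps)]
    for _ in range(steps):
        k1 = rhs(th, ps)
        k2 = rhs(th + 0.5 * dt * k1[0], ps + 0.5 * dt * k1[1])
        k3 = rhs(th + 0.5 * dt * k2[0], ps + 0.5 * dt * k2[1])
        k4 = rhs(th + dt * k3[0], ps + dt * k3[1])
        th += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        ps += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        path.append((th, ps))
    return path

def level_value(th, ps, g, ell):
    """The function (1.6.17), conserved only when there is no damping."""
    return ps * ps - (2.0 * g / ell) * math.cos(th)

if __name__ == "__main__":
    g, ell, damping = 32.0, 2.0, 0.5          # ell and damping are assumed values
    path = damped_pendulum_path(1.0, 0.0, g, ell, damping)
    values = [level_value(th, ps, g, ell) for th, ps in path]
    print(values[0], values[len(values) // 2], values[-1])
```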
Exercises 


1. Suppose v(t) solves (1.7.2) and v(0) = 0. Show that
$\displaystyle\lim_{t \to +\infty} v(t) = \dfrac{mg}{\alpha}$
and
$v(t) < \dfrac{mg}{\alpha}, \quad \forall\, t \in [0, \infty).$

What does it mean to call $mg/\alpha$ the terminal velocity?

2. Do the analogue of Exercise 1 when v(t) solves (1.7.3) and v(0) = 0.


3. In the setting of Exercise 1, what happens if, instead of v(0) = 0, we have
$v(0) = v_0 > \dfrac{mg}{\alpha}?$


4. Apply the method of separation of variables to (1.7.3). Note that
$g - \dfrac{\alpha}{m}v - \dfrac{\beta}{m}v^3 = p(v)$
has three complex roots (at least one of which must be real). For what values of
$\alpha$, $\beta$, and m does p(v) have one real root and for what values does it have three real
roots? How does this bear on the behavior of
$\displaystyle\int \dfrac{dv}{p(v)}?$

5. More general models for motion with resistance involve the following modification 
of (1.5.6): 


Parallel to (1.5.4), set
Show that
dE 
sore 
dt ~ 
One says energy is dissipated, due to the resistance. 


0. 


1.8. Linearization 


As we have seen, some equations, such as the pendulum equation (1.6.9), which we
rewrite here as
(1.8.1)    $\dfrac{d^2x}{dt^2} + \dfrac{g}{\ell}\sin x = 0,$
can be “solved” in terms of an integral, in this case (1.6.24), i.e.,
(1.8.2)    $\displaystyle\int \dfrac{dx}{\sqrt{A_1 + \cos x}} = \pm\sqrt{\dfrac{2g}{\ell}}\,t + C.$
However, the integral is a complicated special function. By contrast, other equations,
such as the damped pendulum equation (1.7.6), which we rewrite
(1.8.3)    $\dfrac{d^2x}{dt^2} + \dfrac{\alpha}{m}\dfrac{dx}{dt} + \dfrac{g}{\ell}\sin x = 0,$
are not even amenable to solutions as “explicit” as (1.8.2). In such cases one might
nevertheless gain valuable insight into solutions that are small perturbations of
some known particular solution to (1.8.1) or (1.8.3), or more generally
(1.8.4)    $x''(t) = f(t, x(t), x'(t)).$

In case (1.8.1) and (1.8.3), x(t) = 0 is a solution. More generally, one might have
a known solution y(t) of (1.8.4); i.e., y(t) is known and satisfies
(1.8.5)    $y''(t) = f(t, y(t), y'(t)).$


Now take $x(t) = y(t) + \varepsilon u(t)$. We derive an equation for u(t) so that x(t) satisfies
(1.8.4), at least up to $O(\varepsilon^2)$, i.e.,
(1.8.6)    $y''(t) + \varepsilon u''(t) = f\bigl(t, y(t) + \varepsilon u(t), y'(t) + \varepsilon u'(t)\bigr) + O(\varepsilon^2).$
To get this equation, write, with f = f(t,x,v),
(1.8.7)    $f(t, y + \varepsilon u, y' + \varepsilon u') = f(t, y, y') + \varepsilon\Bigl(\dfrac{\partial f}{\partial x}(t, y, y')\,u + \dfrac{\partial f}{\partial v}(t, y, y')\,u'\Bigr) + O(\varepsilon^2),$
the first order Taylor polynomial approximation. Plugging this into (1.8.6) and
using (1.8.5), we see that (1.8.6) holds provided u(t) satisfies the equation
(1.8.8)    $u''(t) = A(t)u(t) + B(t)u'(t),$
where
(1.8.9)    $A(t) = \dfrac{\partial f}{\partial x}\bigl(t, y(t), y'(t)\bigr), \qquad B(t) = \dfrac{\partial f}{\partial v}\bigl(t, y(t), y'(t)\bigr).$

The equation (1.8.8) is a linear equation, called the linearization of (1.8.4) about
the solution y(t).
In case (1.8.1), $f(t, x, v) = -(g/\ell)\sin x$, and the linearization about y(t) = 0 of
this equation is
(1.8.10)    $\dfrac{d^2u}{dt^2} + \dfrac{g}{\ell}u = 0.$
In case (1.8.3), $f(t, x, v) = -(\alpha/m)v - (g/\ell)\sin x$, and the linearization about y(t) = 0
of this equation is
(1.8.11)    $\dfrac{d^2u}{dt^2} + \dfrac{\alpha}{m}\dfrac{du}{dt} + \dfrac{g}{\ell}u = 0.$
To take another example, consider
(1.8.12)    $x''(t) = t\,x(t) - x(t)^2.$
One solution is
(1.8.13)    $y(t) = t.$

In this case we have (1.8.4) with $f(t, x, v) = tx - x^2$, hence $f_x(t, x, v) = t - 2x$ and
$f_v(t, x, v) = 0$. Then $f_x(t, y, y') = f_x(t, t, 1) = -t$, and the linearization of (1.8.12)
about y(t) = t is
(1.8.14)    $u''(t) + t\,u(t) = 0.$
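
The coefficients A(t) and B(t) in (1.8.9) can also be approximated numerically, which gives a quick way to check a linearization computed by hand. The Python sketch below is an added illustration with hypothetical names. It estimates the partial derivatives of f by centered differences and applies this to the example (1.8.12)-(1.8.13), where the expected values are $A(t) = -t$ and $B(t) = 0$.

```python
def linearization_coeffs(f, y, dy, t, h=1e-6):
    """Approximate A(t) = f_x and B(t) = f_v at (t, y(t), y'(t)), as in (1.8.9).

    f  : function f(t, x, v) from x'' = f(t, x, x')
    y  : known solution, as a function of t
    dy : its derivative y', as a function of t
    """
    x0, v0 = y(t), dy(t)
    A = (f(t, x0 + h, v0) - f(t, x0 - h, v0)) / (2 * h)
    B = (f(t, x0, v0 + h) - f(t, x0, v0 - h)) / (2 * h)
    return A, B

# Example (1.8.12) with the solution y(t) = t from (1.8.13).
def f_example(t, x, v):
    return t * x - x * x

def y_example(t):
    return t

def dy_example(t):
    return 1.0

if __name__ == "__main__":
    for t in (0.0, 1.0, 3.0):
        print(t, linearization_coeffs(f_example, y_example, dy_example, t))
```

The output should be close to (-t, 0) at each sample time, in agreement with (1.8.14).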


Exercises 


Compute the linearizations of the following equations, about the given solution y(t). 


1.
$x'' + \cosh x - \cosh 1 = 0, \quad y(t) = 1.$
2.
$x'' + \cosh x - \cosh t = 0, \quad y(t) = t.$
3.
$x'' + x'\sin x = 0, \quad y(t) = 0.$
4.
$x'' + x'\sin x = 0, \quad y(t) = \dfrac{\pi}{2}.$
5.
$x'' + \sin x = 0, \quad y(t) = \pi.$


1.9. Second order constant-coefficient linear equations-homogeneous 


Here we look into solving differential equations of the form
(1.9.1)    $a\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = 0,$
with constants a, b, and c. We assume $a \ne 0$. We impose an initial condition, such
as
(1.9.2)    $x(0) = \alpha, \quad x'(0) = \beta.$

We look for solutions in the form
(1.9.3)    $x(t) = e^{rt},$
for some constant r, which worked so well for first order equations in §1.1. By results
derived there, if x(t) has the form (1.9.3), then $x'(t) = re^{rt}$ and $x''(t) = r^2e^{rt}$, so
substitution into the left side of (1.9.1) gives
(1.9.4)    $(ar^2 + br + c)e^{rt},$
which vanishes if and only if r satisfies the equation
(1.9.5)    $ar^2 + br + c = 0.$
The polynomial $p(r) = ar^2 + br + c$ is called the characteristic polynomial associated
with the differential equation (1.9.1). Its roots are given by
(1.9.6)    $r_\pm = -\dfrac{b}{2a} \pm \dfrac{1}{2a}\sqrt{b^2 - 4ac}.$


There are two cases to consider:
(I)   $b^2 - 4ac \ne 0,$
(II)  $b^2 - 4ac = 0.$


In Case I, the equation (1.9.5) has two distinct roots, and we get two distinct
solutions to (1.9.1), $e^{r_+t}$ and $e^{r_-t}$. It is easy to see that whenever $x_1(t)$ and $x_2(t)$
solve (1.9.1), so does $C_1x_1(t) + C_2x_2(t)$, for arbitrary constants $C_1$ and $C_2$. Hence
(1.9.7)    $x(t) = C_+e^{r_+t} + C_-e^{r_-t}$
solves (1.9.1), for all constants $C_+$ and $C_-$.


Having this, we can find a solution to (1.9.1) with initial data (1.9.2) as follows.
Taking x(t) as in (1.9.7), so $x'(t) = C_+r_+e^{r_+t} + C_-r_-e^{r_-t}$, we set t = 0 to obtain
(1.9.8)    $x(0) = C_+ + C_-, \quad x'(0) = r_+C_+ + r_-C_-,$
so (1.9.2) holds if and only if $C_+$ and $C_-$ satisfy
(1.9.9)    $C_+ + C_- = \alpha, \qquad r_+C_+ + r_-C_- = \beta.$

This set of two linear equations for $C_+$ and $C_-$ has a unique solution if and only if
$r_+ \ne r_-$. In fact, the first equation in (1.9.9) gives
(1.9.10)    $r_-C_+ + r_-C_- = r_-\alpha,$
and subtracting this from the second equation in (1.9.9) yields
(1.9.11)    $C_+ = \dfrac{\beta - r_-\alpha}{r_+ - r_-},$
and then the first equation in (1.9.9) yields
(1.9.12)    $C_- = \alpha - C_+ = \dfrac{r_+\alpha - \beta}{r_+ - r_-}.$
In Case II, $r = -b/2a$ is a double root of the characteristic polynomial, and we
have the solution $x(t) = e^{rt}$ to (1.9.1). We claim there is another solution to (1.9.1)
that is not simply a constant multiple of this one. We look for a second solution in
the form
(1.9.13)    $x(t) = u(t)e^{rt},$
hoping to get a simpler differential equation for u(t). Note that then $x' = (u' + ru)e^{rt}$
and $x'' = (u'' + 2ru' + r^2u)e^{rt}$, and hence
(1.9.14)    $ax'' + bx' + cx = \{a(u'' + 2ru' + r^2u) + b(u' + ru) + cu\}e^{rt} = \{au'' + (2ar + b)u' + (ar^2 + br + c)u\}e^{rt} = au''e^{rt},$
given that (1.9.5) holds with $r = -b/2a$. Thus the vanishing of (1.9.14) is equivalent
to $u''(t) = 0$, i.e., to $u(t) = C_1 + C_2t$. Hence another solution to (1.9.1) in this case
is $te^{rt}$, and, in place of (1.9.7), we have solutions
(1.9.15)    $x(t) = C_1e^{rt} + C_2te^{rt},$
for all constants $C_1$ and $C_2$.


We can then find a solution to (1.9.1) with initial data (1.9.2) as follows. Taking
x(t) as in (1.9.15), so $x'(t) = C_1re^{rt} + C_2rte^{rt} + C_2e^{rt}$, we set t = 0 to obtain
(1.9.16)    $x(0) = C_1, \quad x'(0) = rC_1 + C_2,$
so (1.9.2) is satisfied if and only if $C_1$ and $C_2$ satisfy
(1.9.17)    $C_1 = \alpha, \quad rC_1 + C_2 = \beta,$
i.e., if and only if
(1.9.18)    $C_1 = \alpha, \quad C_2 = \beta - r\alpha.$


We claim that the constructions given above provide all of the solutions to
(1.9.1), in the two respective cases. To see this, let x(t) be any solution to (1.9.1),
let $r = r_+$ (which equals $r_-$ in Case II), and consider $u(t) = e^{-rt}x(t)$, as in (1.9.13).
The computation (1.9.14) holds if $r_+ = r_-$, and if $r_+ \ne r_-$ we get
(1.9.19)    $ax'' + bx' + cx = \{au'' + (2ar + b)u'\}e^{rt}.$

As we have seen, when $r_+ = r_-$ this forces $u''(t) = 0$, which hence forces u(t) to
have the form $C_1 + C_2t$ for some constants $C_j$, and hence $x(t) = C_1e^{rt} + C_2te^{rt}$.
When $r_+ \ne r_-$, vanishing of (1.9.19) forces
(1.9.20)    $av' + (2ar + b)v = 0$, with $v = u'$,
which, by results of §1.1, forces
(1.9.21)    $v(t) = K_0e^{-(2r + b/a)t}$, hence $u(t) = K_1 + K_2e^{-(2r + b/a)t},$
for some constants $K_0$, $K_1$, and $K_2$. This in turn implies
(1.9.22)    $x(t) = K_1e^{rt} + K_2e^{-(r + b/a)t}.$
But (1.9.6) gives $r_+ + r_- = -b/a$, hence
(1.9.23)    $r = r_+ \Longrightarrow -\Bigl(r + \dfrac{b}{a}\Bigr) = r_-,$
so (1.9.22) is indeed of the form (1.9.7), with $C_+ = K_1$ and $C_- = K_2$.


The arguments given above show that indeed all solutions to (1.9.1) have the
form (1.9.7) or (1.9.15), in Cases I and II, respectively. We say that (1.9.7) (in Case
I) and (1.9.15) (in Case II) provide the general solution to (1.9.1). This analysis of
the general solutions, together with the computations giving (1.9.12) and (1.9.18),
establishes the following.


Theorem 1.9.1. Given a, b, and c, with $a \ne 0$, and given $\alpha$ and $\beta$, the initial
value problem (1.9.1)-(1.9.2) has a unique solution x(t). In Case I, x(t) has the
form (1.9.7), and in Case II, it has the form (1.9.15).


REMARK. The uniqueness proof given above uses the same principle as that for the
first order equation dx/dt = ax in §1.1, but the details here are more elaborate.
In §3.1, we will give a uniqueness proof for first order, constant coefficient linear
systems that looks almost exactly like the argument in §1.1.
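
The recipe behind Theorem 1.9.1 translates directly into a short program. The Python sketch below is an added illustration, not the author's code. The function name and the tolerance `tol` used to decide between Case I and Case II are assumptions, and in floating point that decision is only approximate. It uses (1.9.11)-(1.9.12) in Case I and (1.9.18) in Case II, working with complex arithmetic so that complex roots need no separate treatment.

```python
import cmath

def solve_const_coeff_ivp(a, b, c, alpha, beta, tol=1e-12):
    """Return a function x(t) with a x'' + b x' + c x = 0, x(0)=alpha, x'(0)=beta.

    The returned values are complex numbers; for real data the imaginary
    part is zero up to rounding error.
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    disc = b * b - 4 * a * c
    if abs(disc) > tol:
        # Case I: distinct roots r_+ and r_-, solution of the form (1.9.7).
        s = cmath.sqrt(disc)
        r_plus, r_minus = (-b + s) / (2 * a), (-b - s) / (2 * a)
        c_plus = (beta - r_minus * alpha) / (r_plus - r_minus)    # (1.9.11)
        c_minus = alpha - c_plus                                   # (1.9.12)
        return lambda t: c_plus * cmath.exp(r_plus * t) + c_minus * cmath.exp(r_minus * t)
    # Case II: double root r = -b/(2a), solution of the form (1.9.15).
    r = -b / (2 * a)
    c1, c2 = alpha, beta - r * alpha                               # (1.9.18)
    return lambda t: (c1 + c2 * t) * cmath.exp(r * t)

if __name__ == "__main__":
    x = solve_const_coeff_ivp(1, 0, 25, 1, 0)     # x'' + 25 x = 0: expect cos(5t)
    print(x(0.3))
    x = solve_const_coeff_ivp(1, 2, 1, 0, 1)      # x'' + 2x' + x = 0: expect t e^{-t}
    print(x(2.0))
```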


The results derived above apply whether a, b, and c are real or not. If we
assume they are real, then Case I naturally divides into two subcases,
(IA)  $b^2 - 4ac > 0,$
(IB)  $b^2 - 4ac < 0.$


In Case IA, the roots of the characteristic equation (1.9.5) given by (1.9.6) are real.
In Case IB, we have complex roots, of the form
(1.9.24)    $r_\pm = r \pm i\sigma, \qquad r = -\dfrac{b}{2a}, \quad \sigma = \dfrac{1}{2a}\sqrt{4ac - b^2}.$
Hence the solutions (1.9.7) have the form
(1.9.25)    $x(t) = C_+x_+(t) + C_-x_-(t), \qquad x_\pm(t) = e^{(r \pm i\sigma)t}.$
From §1.1 we have $e^{(r \pm i\sigma)t} = e^{rt}e^{\pm i\sigma t}$ and also
(1.9.26)    $e^{\pm i\sigma t} = \cos\sigma t \pm i\sin\sigma t.$
Hence
(1.9.27)    $x_\pm(t) = e^{rt}(\cos\sigma t \pm i\sin\sigma t).$

In particular, the following are also solutions to (1.9.1):
(1.9.28)    $x_1(t) = \dfrac{1}{2}\bigl(x_+(t) + x_-(t)\bigr) = e^{rt}\cos\sigma t, \qquad x_2(t) = \dfrac{1}{2i}\bigl(x_+(t) - x_-(t)\bigr) = e^{rt}\sin\sigma t.$
We can hence rewrite (1.9.25) as $x(t) = C_1x_1(t) + C_2x_2(t)$, or equivalently
(1.9.29)    $x(t) = C_1e^{rt}\cos\sigma t + C_2e^{rt}\sin\sigma t,$
for some constants $C_1$ and $C_2$, related to $C_+$ and $C_-$ by
(1.9.30)    $C_1 = C_+ + C_-, \qquad C_2 = i(C_+ - C_-).$


We can combine these relations with (1.9.11)-(1.9.12) to solve the initial value
problem (1.9.1)-(1.9.2).

We now apply the methods just developed to the linearized pendulum and
damped pendulum equations (1.8.10) and (1.8.11), i.e.,
(1.9.31)    $\dfrac{d^2u}{dt^2} + \dfrac{g}{\ell}u = 0$
and
(1.9.32)    $\dfrac{d^2u}{dt^2} + \dfrac{\alpha}{m}\dfrac{du}{dt} + \dfrac{g}{\ell}u = 0.$
Here, $g$, $\ell$, $\alpha$, and $m$ are all > 0. Let us set
(1.9.33)    $k = \sqrt{\dfrac{g}{\ell}}, \qquad b = \dfrac{\alpha}{m},$
so b > 0, k > 0, and the equations (1.9.31)-(1.9.32) become
(1.9.34)    $\dfrac{d^2u}{dt^2} + k^2u = 0$
and
(1.9.35)    $\dfrac{d^2u}{dt^2} + b\dfrac{du}{dt} + k^2u = 0.$


The characteristic equation for (1.9.34) is $r^2 + k^2 = 0$, with roots $r = \pm ik$. The
general solution to (1.9.34) can hence be written either as $u(t) = C_+e^{ikt} + C_-e^{-ikt}$
or as
(1.9.36)    $u(t) = C_1\cos kt + C_2\sin kt.$
The resulting motion is oscillatory motion, with period $2\pi/k$.


The characteristic equation for (1.9.35) is $r^2 + br + k^2 = 0$, with roots
(1.9.37)    $r_\pm = -\dfrac{b}{2} \pm \dfrac{1}{2}\sqrt{b^2 - 4k^2}.$

There are three cases to consider:
(IB)  $b^2 - 4k^2 < 0,$
(II)  $b^2 - 4k^2 = 0,$
(IA)  $b^2 - 4k^2 > 0.$


In Case IB, say $b^2 - 4k^2 = -4\kappa^2$. Then $r_\pm = -(b/2) \pm i\kappa$, and the general solution
to (1.9.35) has the form
(1.9.38)    $u(t) = C_1e^{-bt/2}\cos\kappa t + C_2e^{-bt/2}\sin\kappa t.$

These decay exponentially as $t \nearrow +\infty$. This is damped oscillatory motion. The
oscillatory factors have period
(1.9.39)    $\dfrac{2\pi}{\kappa} = \dfrac{2\pi}{\sqrt{k^2 - b^2/4}},$
which approaches $\infty$ as $b \nearrow 2k$.
In Case IA, say $\beta = \sqrt{b^2 - 4k^2}$, so $r_\pm = (-b \pm \beta)/2$. Note that $0 < \beta < b$, so
both $r_+$ and $r_-$ are negative. The general solution to (1.9.35) then has the form
(1.9.40)    $u(t) = C_1e^{(-b+\beta)t/2} + C_2e^{(-b-\beta)t/2}, \qquad -b + \beta < 0.$
These decay without oscillation as $t \nearrow +\infty$. One says this motion is overdamped.
In Case II, the characteristic equation for (1.9.35) has the double root $-b/2$, and
the general solution to (1.9.35) has the form
(1.9.41)    $u(t) = C_1e^{-bt/2} + C_2te^{-bt/2}.$
These also decay without oscillation as $t \nearrow +\infty$. One says this motion is critically
damped.


The nonlinear damped pendulum equation (1.7.6) can also be shown to manifest
these damped oscillatory, critically damped, and overdamped behaviors.
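
The three regimes for (1.9.35) depend only on the sign of $b^2 - 4k^2$, which the following Python sketch makes explicit. It is an added illustration with a hypothetical function name. In the damped oscillatory case it also returns the period $2\pi/\kappa$ from (1.9.39).

```python
import math

def classify_damping(b, k):
    """Classify u'' + b u' + k^2 u = 0 (b > 0, k > 0) by the sign of b^2 - 4k^2.

    Returns (label, period), where period is 2*pi/kappa in the oscillatory
    case and None otherwise. The exact-equality test for the critical case
    is only meaningful for inputs that are exactly representable.
    """
    disc = b * b - 4.0 * k * k
    if disc < 0:
        kappa = math.sqrt(-disc) / 2.0        # b^2 - 4k^2 = -4 kappa^2
        return "damped oscillatory", 2.0 * math.pi / kappa
    if disc == 0:
        return "critically damped", None
    return "overdamped", None

if __name__ == "__main__":
    for b in (1.0, 2.0, 3.0):
        print(b, classify_damping(b, 1.0))
```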


Exercises 


1. Find the general solution to each of the following equations for x = x(t). 


xv’ +252 =0. 
xe” — 252 = 0. 


a” —2' +2 =0. 


a’ 4+2¢'+2=0. 


w'+a'+x2=0. 


2. In each case (a)-(e) of Exercise 1, find the solution satisfying the initial condition


x(0)=1, 2'(0)=0. 


3. In each case (a)—(e) of Exercise 1, find the solution satisfying the initial condition 


x(0)=0, (0) =1. 


4. For $\varepsilon \ne 0$, solve the initial value problem
$x_\varepsilon'' - 2x_\varepsilon' + (1 - \varepsilon^2)x_\varepsilon = 0, \quad x_\varepsilon(0) = 0, \quad x_\varepsilon'(0) = 1.$
Compute the limit
$x(t) = \displaystyle\lim_{\varepsilon \to 0} x_\varepsilon(t),$
and show that the limit solves
$x'' - 2x' + x = 0, \quad x(0) = 0, \quad x'(0) = 1.$


1.10. Nonhomogeneous equations I - undetermined coefficients


We study nonhomogeneous, second order, constant coefficient linear equations, that
is to say, equations of the form
(1.10.1)    $a\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = f(t),$
with constants a, b, and c ($a \ne 0$) and a given function f(t). The equation (1.10.1)
is called nonhomogeneous whenever f(t) is not identically 0. We might impose
initial conditions, like
(1.10.2)    $x(0) = \alpha, \quad x'(0) = \beta.$

In this section we assume f(t) is a constant multiple of one of the following functions,
or perhaps a finite sum of such functions:
(1.10.3)    $e^{\kappa t},$
(1.10.4)    $\sin\sigma t,$
(1.10.5)    $\cos\sigma t,$
(1.10.6)    $t^k.$

We discuss a method, called the method of undetermined coefficients, to solve 
(1.10.1) in such cases. In §1.14 we will discuss a method that applies to a broader 
class of functions f. 

We begin with the case (1.10.3). The first strategy is to seek a solution in the
form
(1.10.7)    $x(t) = Ae^{\kappa t}.$

Here A is the undetermined coefficient. The goal will be to determine it. Plugging
(1.10.7) into the left side of (1.10.1) gives
(1.10.8)    $ax'' + bx' + cx = A(a\kappa^2 + b\kappa + c)e^{\kappa t}.$
As long as $\kappa$ is not a root of the characteristic polynomial $p(r) = ar^2 + br + c$, we
get a solution to (1.10.1) in the form (1.10.7), with
(1.10.9)    $A = \dfrac{1}{a\kappa^2 + b\kappa + c}.$

In such a case, the equation
(1.10.10)    $a\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = Be^{\kappa t}$
has a solution,
(1.10.11)    $x_p(t) = ABe^{\kappa t},$


with A given by (1.10.9). We say $x_p(t)$ is a particular solution to (1.10.10). If x(t)
is another solution, then, because the equation is linear, $y(t) = x(t) - x_p(t)$ solves
the homogeneous equation
(1.10.12)    $a\dfrac{d^2y}{dt^2} + b\dfrac{dy}{dt} + cy = 0,$
which was treated in §1.9. If, for example, p(r) has distinct roots $r_+$ and $r_-$, we
know the general solution of (1.10.12) is
(1.10.13)    $y(t) = C_+e^{r_+t} + C_-e^{r_-t}.$
Then the general solution to (1.10.10) is
(1.10.14)    $x(t) = \dfrac{B}{a\kappa^2 + b\kappa + c}e^{\kappa t} + C_+e^{r_+t} + C_-e^{r_-t}.$
In (1.10.14), a, b, c, B, and $\kappa$ are given by (1.10.10), and $C_+$ and $C_-$ are arbitrary
constants. If the initial conditions in (1.10.2) are imposed, they will determine $C_+$
and $C_-$. If $r_+$ and $r_-$ are complex, we could rewrite (1.10.13)-(1.10.14), using
Euler's formula, as in §1.9.

Formulas (1.10.11)-(1.10.14) hold under the hypothesis that $r_+$, $r_-$, and $\kappa$ are
all distinct. If the characteristic polynomial has a double root $r = r_+ = r_-$, distinct
from $\kappa$, then we replace (1.10.13) by
(1.10.15)    $y(t) = C_1e^{rt} + C_2te^{rt},$
and the general solution to (1.10.10) has the form
(1.10.16)    $x(t) = \dfrac{B}{a\kappa^2 + b\kappa + c}e^{\kappa t} + C_1e^{rt} + C_2te^{rt}.$

Again, the initial conditions (1.10.2) would determine $C_1$ and $C_2$.


We turn to the case that $\kappa$ is a root of the characteristic polynomial p(r). In
such a case, (1.10.8) vanishes, and there is not a solution to (1.10.1) in the form
(1.10.7). This study splits into two cases. First assume p(r) has distinct roots. Say
$\kappa = r_+ \ne r_-$. Then (1.10.1) (with $f(t) = e^{\kappa t}$) will have a solution of the form
(1.10.17)    $x(t) = Ate^{\kappa t}.$
Indeed, a computation parallel to (1.9.14), with u(t) = At, $r = \kappa$, gives
(1.10.18)    $ax'' + bx' + cx = (2a\kappa + b)Ae^{\kappa t},$
since in this case $u'' = 0$ and $a\kappa^2 + b\kappa + c = 0$. Then (1.10.1) holds with $f(t) = e^{\kappa t}$,
provided
(1.10.19)    $A = \dfrac{1}{2a\kappa + b},$
and more generally a particular solution to (1.10.10) is given by
(1.10.20)    $x_p(t) = ABte^{\kappa t},$
with A given by (1.10.19). As above, the general solution to (1.10.10) then has the
form
(1.10.21)    $x(t) = x_p(t) + y(t),$
where y(t) solves (1.10.12), hence has the form (1.10.13) (given $r_+ \ne r_-$).


To finish the analysis of (1.10.10), it remains to consider the case $\kappa = r_+ = r_-$.
Then functions of the form (1.10.15) (with $r = \kappa$) solve (1.10.12), so there is not a
solution to (1.10.1) (with $f(t) = e^{\kappa t}$) of the form (1.10.17). Instead, we will find a
solution of the form

(1.10.22) $x(t) = At^2e^{\kappa t}.$

In this case, a computation parallel to (1.9.14), with $u(t) = At^2$, $r = \kappa$, gives

(1.10.23) $ax'' + bx' + cx = 2aAe^{\kappa t},$

since in this case $u'' = 2A$, $2a\kappa + b = 0$, and $a\kappa^2 + b\kappa + c = 0$. Then (1.10.1) holds
with $f(t) = e^{\kappa t}$ provided

(1.10.24) $A = \dfrac{1}{2a},$

and more generally a particular solution to (1.10.10) is given by

(1.10.25) $x_p(t) = ABt^2e^{\kappa t},$

with $A$ given by (1.10.24). Then the general solution to (1.10.10) has the form
(1.10.21), where $y(t)$ solves (1.10.12), hence has the form (1.10.15), with $r = \kappa$.
(Recall we are assuming $r_+ = r_-$.)
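
The three cases above ($\kappa$ not a root, a simple root, or a double root of $p$) can be collected into one small routine. The following Python sketch is an illustration only: the function names `exp_forcing_particular` and `residual` are not from the text, and the tolerance used to decide whether $p(\kappa)$ or $p'(\kappa)$ vanishes is an assumption suited to exactly representable inputs.

```python
import cmath

def exp_forcing_particular(a, b, c, B, kappa, tol=1e-12):
    """Return (coef, m) so that x_p(t) = coef * t**m * exp(kappa*t)
    solves a x'' + b x' + c x = B exp(kappa*t).  Assumes a != 0."""
    p = a * kappa**2 + b * kappa + c   # p(kappa)
    dp = 2 * a * kappa + b             # p'(kappa)
    if abs(p) > tol:                   # kappa not a root: (1.10.9), (1.10.11)
        return B / p, 0
    if abs(dp) > tol:                  # kappa a simple root: (1.10.19), (1.10.20)
        return B / dp, 1
    return B / (2 * a), 2              # kappa a double root: (1.10.24), (1.10.25)

def residual(a, b, c, B, kappa, coef, m, t):
    """Value of a x'' + b x' + c x - B exp(kappa*t) at t != 0 for the ansatz."""
    e = cmath.exp(kappa * t)
    x = coef * t**m * e
    dx = coef * (m * t**(m - 1) + kappa * t**m) * e
    d2x = coef * (m * (m - 1) * t**(m - 2)
                  + 2 * kappa * m * t**(m - 1)
                  + kappa**2 * t**m) * e
    return a * d2x + b * dx + c * x - B * e
```

For instance, with $p(r) = r^2 - 3r + 2$ (roots 1 and 2), the call `exp_forcing_particular(1, -3, 2, 4, 1)` falls into the simple-root case and returns the coefficient of $te^{t}$.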


As a slight extension of (1.10.10), consider the equation

(1.10.26) $a\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = B_1e^{\kappa_1t} + B_2e^{\kappa_2t}.$

This has a solution of the form

(1.10.27) $x_p(t) = x_{p1}(t) + x_{p2}(t),$

where $x_{pj}(t)$ are particular solutions of (1.10.10), with $B$ replaced by $B_j$ and $\kappa$
replaced by $\kappa_j$. Then the general solution to (1.10.26) has the form (1.10.21), with
$x_p(t)$ given by (1.10.27) and $y(t)$ solving (1.10.12).

We move on to cases of $f(t)$ given by (1.10.4) and (1.10.5), which we combine
as follows:

(1.10.28) $a\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = b_1\sin\sigma t + b_2\cos\sigma t.$

Via Euler's formula we can write

(1.10.29) $b_1\sin\sigma t + b_2\cos\sigma t = B_1e^{i\sigma t} + B_2e^{-i\sigma t}, \quad B_1 = \dfrac{b_1}{2i} + \dfrac{b_2}{2}, \quad B_2 = -\dfrac{b_1}{2i} + \dfrac{b_2}{2},$

and we are back in the setting (1.10.26), with $\kappa_1 = i\sigma$, $\kappa_2 = -i\sigma$. Thus, for
example, if $\pm i\sigma$ are not roots of the characteristic polynomial $p(r) = ar^2 + br + c$,
we have a particular solution of the form

(1.10.30) $x_p(t) = A_1B_1e^{i\sigma t} + A_2B_2e^{-i\sigma t},$

where $B_1$ and $B_2$ are as in (1.10.29) and the undetermined coefficients $A_1$ and $A_2$
can be obtained by plugging into (1.10.28). As an alternative presentation, we can
again use Euler's formula to rewrite (1.10.30) as

(1.10.31) $x_p(t) = a_1\sin\sigma t + a_2\cos\sigma t,$

where the undetermined coefficients $a_1$ and $a_2$ are obtained by plugging into (1.10.28).
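
The passage from the complex form (1.10.30) to the real form (1.10.31) can be traced numerically. The sketch below is illustrative and assumes real $a, b, c, b_1, b_2$ with $\pm i\sigma$ not roots of $p$; the function name `sinusoid_particular` is hypothetical. It takes $A_j = 1/p(\kappa_j)$ as in (1.10.9) and uses $e^{\pm i\sigma t} = \cos\sigma t \pm i\sin\sigma t$ to read off $a_1 = i(A_1B_1 - A_2B_2)$ and $a_2 = A_1B_1 + A_2B_2$.

```python
def sinusoid_particular(a, b, c, b1, b2, sigma):
    """Coefficients (a1, a2) of x_p = a1 sin(sigma t) + a2 cos(sigma t)
    for a x'' + b x' + c x = b1 sin(sigma t) + b2 cos(sigma t)."""
    def p(r):
        return a * r * r + b * r + c      # characteristic polynomial

    k1, k2 = 1j * sigma, -1j * sigma      # kappa_1, kappa_2
    if p(k1) == 0 or p(k2) == 0:
        raise ValueError("resonant case: use the t*sin, t*cos form (1.10.32)")
    B1 = b1 / 2j + b2 / 2                 # (1.10.29)
    B2 = -b1 / 2j + b2 / 2
    A1, A2 = 1 / p(k1), 1 / p(k2)         # (1.10.9) applied to each exponential
    a1 = 1j * (A1 * B1 - A2 * B2)         # coefficient of sin(sigma t)
    a2 = A1 * B1 + A2 * B2                # coefficient of cos(sigma t)
    return a1.real, a2.real               # imaginary parts cancel for real data
```

Plugging $x_p$ into (1.10.28) directly gives the real linear system $(c - a\sigma^2)a_1 - b\sigma a_2 = b_1$, $b\sigma a_1 + (c - a\sigma^2)a_2 = b_2$, which provides an independent check on the output.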


If $a$, $b$, and $c$ in (1.10.1) are all real, then $p(r)$ will not have purely imaginary
roots if $b \neq 0$. If $b = 0$, the roots will be $r_\pm = \pm\sqrt{-c/a}$, which are real if $c/a < 0$
and purely imaginary if $c/a > 0$. In case $r_\pm = \pm i\sigma$, considerations parallel to
(1.10.17)–(1.10.20) apply, with $\kappa = \pm i\sigma$. Again a further application of Euler's
formula gives

(1.10.32) $x_p(t) = a_1t\sin\sigma t + a_2t\cos\sigma t,$

where the coefficients $a_1$ and $a_2$ are obtained by plugging into (1.10.28).


We now move to cases of $f(t)$ given by (1.10.6). Take $k = 1$, so we are looking
at

(1.10.33) $ax'' + bx' + cx = t.$

We try

(1.10.34) $x(t) = At + B,$

for which $x' = A$, $x'' = 0$, and the left side of (1.10.33) is $cAt + (cB + bA)$. The
condition that (1.10.33) hold is

(1.10.35) $cA = 1, \quad cB + bA = 0,$

solved by

(1.10.36) $A = \dfrac{1}{c}, \quad B = -\dfrac{b}{c^2},$

assuming $c \neq 0$. If $c = 0$, we want to solve

(1.10.37) $av' + bv = t$

for $v = dx/dt$. We try

(1.10.38) $v(t) = \alpha t + \beta,$

for which $v' = \alpha$ and the left side of (1.10.37) is $a\alpha + b(\alpha t + \beta)$. The condition that
(1.10.37) hold is

(1.10.39) $b\alpha = 1, \quad a\alpha + b\beta = 0,$

solved by

(1.10.40) $\alpha = \dfrac{1}{b}, \quad \beta = -\dfrac{a}{b^2},$

assuming $b \neq 0$. In such a case, we can take

(1.10.41) $x(t) = \dfrac{\alpha}{2}t^2 + \beta t.$

In case $c = b = 0$, (1.10.33) becomes

(1.10.42) $ax'' = t,$

with solution

(1.10.43) $x(t) = \dfrac{1}{6a}t^3.$
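
As a check on the three cases for $ax'' + bx' + cx = t$, the following Python sketch returns the polynomial coefficients of the particular solution and evaluates the left side of the equation. The names `linear_forcing_particular` and `lhs` are illustrative, not from the text, and $a \neq 0$ is assumed throughout.

```python
def linear_forcing_particular(a, b, c):
    """Coefficients [c0, c1, c2, c3] of x_p(t) = c0 + c1 t + c2 t^2 + c3 t^3
    solving a x'' + b x' + c x = t, following (1.10.34)-(1.10.43)."""
    if c != 0:                      # x_p = A t + B, (1.10.36)
        return [-b / c**2, 1 / c, 0.0, 0.0]
    if b != 0:                      # x_p = (alpha/2) t^2 + beta t, (1.10.40)-(1.10.41)
        alpha, beta = 1 / b, -a / b**2
        return [0.0, beta, alpha / 2, 0.0]
    return [0.0, 0.0, 0.0, 1 / (6 * a)]   # c = b = 0, (1.10.43)

def lhs(a, b, c, coeffs, t):
    """Evaluate a x'' + b x' + c x for the cubic with the given coefficients."""
    c0, c1, c2, c3 = coeffs
    x = c0 + c1 * t + c2 * t**2 + c3 * t**3
    dx = c1 + 2 * c2 * t + 3 * c3 * t**2
    d2x = 2 * c2 + 6 * c3 * t
    return a * d2x + b * dx + c * x
```

In each case `lhs(a, b, c, linear_forcing_particular(a, b, c), t)` should reproduce `t` up to rounding.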


Analogous considerations apply to (1.10.6) with $k \geq 2$. The method can also
be extended to treat $f(t)$ in the form

(1.10.44) $t^ke^{\kappa t}, \quad t^k\sin\sigma t, \quad t^k\cos\sigma t.$

We omit details. In such cases, it is just as convenient to use the method developed
in §1.14.

See §1.16 for further insight on why the method of undetermined coefficients
works for functions $f(t)$ of the form (1.10.3)–(1.10.6), and more generally of the
form (1.10.44).


| 
Exercises 


1. Find the general solution to each of the following equations for x = x(t). 


a) 


ov" + 25a =e. 


ov" — 25a =e. 


ev” —2e4+2=sint. 


ov" +22’ +a =e. 


uw - 
xv +a” +2x2=cost. 


2. In each case (a)—(e) of Exercise 1, find the solution satisfying the initial conditions 


3. In each case (a)—(e) of Exercises 1, find the solution satisfying the initial condi- 
tions 


4. For e £0, solve the initial value problem 
a! — er, = eOt" (0) =1, 24(0) =0. 


Compute the limit 
x(t) = lim z,(t), 


e>0 


and show that the limit solves
1.11. Forced pendulum—resonance


Here we study the following special cases of (1.10.28), modeling the linearized
pendulum and damped pendulum, respectively, subjected to an additional periodic
force of the form $F_0\sin\sigma t$. The equations we consider are, respectively,

(1.11.1) $\dfrac{d^2u}{dt^2} + \dfrac{g}{\ell}u = F_0\sin\sigma t,$

and

(1.11.2) $\dfrac{d^2u}{dt^2} + \dfrac{\alpha}{m}\dfrac{du}{dt} + \dfrac{g}{\ell}u = F_0\sin\sigma t.$

The quantities $\alpha$, $m$, $g$, and $\ell$ are all positive, and we take $F_0$ and $\sigma$ to be real. As
in (1.9.33), we set

(1.11.3) $k = \sqrt{\dfrac{g}{\ell}}, \quad b = \dfrac{\alpha}{m},$

so $b > 0$, $k > 0$, and the equations (1.11.1)–(1.11.2) become

(1.11.4) $\dfrac{d^2u}{dt^2} + k^2u = F_0\sin\sigma t,$

and

(1.11.5) $\dfrac{d^2u}{dt^2} + b\dfrac{du}{dt} + k^2u = F_0\sin\sigma t.$


As long as $k \neq \pm\sigma$, we can set $u(t) = a_1\sin\sigma t$ and the left side of (1.11.4)
equals $a_1(k^2 - \sigma^2)\sin\sigma t$, so a solution to (1.11.4) is

(1.11.6) $u_p(t) = \dfrac{F_0}{k^2 - \sigma^2}\sin\sigma t,$

in such a case. Note how the coefficient $F_0/(k^2 - \sigma^2)$ blows up as $\sigma \to \pm k$. If $\sigma = k$,
then, as in (1.10.32), we need to seek a solution to (1.11.4) of the form

(1.11.7) $u_p(t) = a_1t\sin\sigma t + a_2t\cos\sigma t.$

In such a case,

(1.11.8) $u_p'' + k^2u_p = 2a_1\sigma\cos\sigma t - 2a_2\sigma\sin\sigma t,$

so (1.11.4) holds provided

(1.11.9) $-2a_2\sigma = F_0, \quad 2a_1\sigma = 0,$

i.e., we have

(1.11.10) $u_p(t) = -\dfrac{F_0}{2\sigma}\,t\cos\sigma t.$

Note that $u_p(t)$ grows without bound as $|t| \to \infty$ in this case, as opposed to the
bounded behavior in $t$ given by (1.11.6) when $\sigma^2 \neq k^2$. We say we have a resonance
at $\sigma^2 = k^2$.

Moving on to (1.11.5), as in (1.9.37) the characteristic polynomial $p(r) = r^2 +
br + k^2$ has roots

(1.11.11) $r_\pm = -\dfrac{b}{2} \pm \dfrac{1}{2}\sqrt{b^2 - 4k^2},$

and as long as $b > 0$, $\pm i\sigma \neq r_\pm$. Hence we can seek a solution to (1.11.5) in the
form

(1.11.12) $u_p(t) = a_1\sin\sigma t + a_2\cos\sigma t.$

A computation gives

(1.11.13) $u_p'' + bu_p' + k^2u_p = (-a_1\sigma^2 - a_2b\sigma + a_1k^2)\sin\sigma t + (-a_2\sigma^2 + a_1b\sigma + a_2k^2)\cos\sigma t,$

so $u_p$ is a solution to (1.11.5) if and only if

(1.11.14) $(k^2 - \sigma^2)a_1 - (b\sigma)a_2 = F_0, \quad (b\sigma)a_1 + (k^2 - \sigma^2)a_2 = 0.$

Solving for $a_1$ and $a_2$ gives

(1.11.15) $a_1 = \dfrac{k^2 - \sigma^2}{(k^2 - \sigma^2)^2 + (b\sigma)^2}F_0, \quad a_2 = -\dfrac{b\sigma}{(k^2 - \sigma^2)^2 + (b\sigma)^2}F_0.$

We can rewrite (1.11.12) as

(1.11.16) $u_p(t) = A\sin(\sigma t + \theta),$

for some constants $A$ and $\theta$, using the identity

(1.11.17) $A\sin(\sigma t + \theta) = A(\cos\theta)\sin\sigma t + A(\sin\theta)\cos\sigma t.$

It follows that (1.11.16) is equivalent to (1.11.12) provided

(1.11.18) $A\cos\theta = a_1, \quad A\sin\theta = a_2,$

i.e., provided

(1.11.19) $a_1 + ia_2 = Ae^{i\theta}.$

We take $A > 0$ such that

(1.11.20) $A^2 = a_1^2 + a_2^2 = \dfrac{F_0^2}{(k^2 - \sigma^2)^2 + (b\sigma)^2}.$

Thus

(1.11.21) $A = \dfrac{|F_0|}{\sqrt{(k^2 - \sigma^2)^2 + (b\sigma)^2}}$

is the amplitude of the solution (1.11.16).
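
The formulas (1.11.15)–(1.11.21) translate directly into a short computation. The sketch below is illustrative (the name `forced_response` is hypothetical); it assumes $b > 0$ and $\sigma \neq 0$, so the denominator in (1.11.15) is nonzero, and it obtains the phase $\theta$ in (1.11.18) from `math.atan2`.

```python
import math

def forced_response(F0, k, b, sigma):
    """Return (a1, a2, A, theta) for u'' + b u' + k^2 u = F0 sin(sigma t)."""
    D = (k**2 - sigma**2)**2 + (b * sigma)**2   # common denominator in (1.11.15)
    a1 = (k**2 - sigma**2) * F0 / D
    a2 = -b * sigma * F0 / D
    A = math.hypot(a1, a2)                      # amplitude, (1.11.20)-(1.11.21)
    theta = math.atan2(a2, a1)                  # A cos(theta) = a1, A sin(theta) = a2
    return a1, a2, A, theta
```

With the returned values, `a1*sin(sigma*t) + a2*cos(sigma*t)` and `A*sin(sigma*t + theta)` agree for every `t`, which is the content of (1.11.16)–(1.11.18).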


If $b$, $k$, and $F_0$ are fixed quantities in (1.11.5) and $\sigma$ is allowed to vary, $A$ in
(1.11.21) is maximized at the value of $\sigma$ for which

(1.11.22) $\beta(\sigma) = (k^2 - \sigma^2)^2 + (b\sigma)^2$

is minimal. We have

(1.11.23) $\beta'(\sigma) = 4\sigma^3 + 2(b^2 - 2k^2)\sigma = 4\sigma\Bigl[\sigma^2 - \Bigl(k^2 - \dfrac{b^2}{2}\Bigr)\Bigr].$

Note that $\sigma = 0$ is a critical point, and $\beta(0) = k^4$. There are two cases. First,

(1.11.24) $k^2 - \dfrac{b^2}{2} > 0 \Longrightarrow \beta_{\min} = \beta\Bigl(\sqrt{k^2 - \dfrac{b^2}{2}}\Bigr) = b^2\Bigl(k^2 - \dfrac{b^2}{4}\Bigr),$

since $k^4 > b^2(k^2 - b^2/4)$. (Indeed, taking $\xi = k^2/b^2$, this inequality is equivalent to
$\xi^2 > \xi - 1/4$, and $\xi^2 - \xi + 1/4 = (\xi - 1/2)^2$.) In the second case,

(1.11.25) $k^2 - \dfrac{b^2}{2} \leq 0 \Longrightarrow \beta_{\min} = \beta(0) = k^4.$

In these respective cases, we get

(1.11.26) $A_{\max} = \dfrac{|F_0|}{b}\Bigl(k^2 - \dfrac{b^2}{4}\Bigr)^{-1/2}$

and

(1.11.27) $A_{\max} = \dfrac{|F_0|}{k^2}.$

In the first case, i.e., (1.11.24), we say resonance is achieved at $\sigma^2 = k^2 - b^2/2$. Recall
from §1.9 that critical damping occurs for $k^2 = b^2/4$, for the unforced pendulum,
so in case (1.11.24) the unforced pendulum has damped oscillatory motion.
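
The case distinction (1.11.24)–(1.11.27) can be spot-checked numerically. In this sketch the function names `amplitude` and `resonance` are illustrative, and $b > 0$, $k > 0$ are assumed as in the text.

```python
import math

def amplitude(F0, k, b, sigma):
    """Amplitude A from (1.11.21)."""
    return abs(F0) / math.sqrt((k * k - sigma * sigma)**2 + (b * sigma)**2)

def resonance(F0, k, b):
    """Return (sigma, A_max): the nonnegative sigma maximizing A, and the maximum."""
    s2 = k * k - b * b / 2
    if s2 > 0:                                   # case (1.11.24), (1.11.26)
        return math.sqrt(s2), abs(F0) / (b * math.sqrt(k * k - b * b / 4))
    return 0.0, abs(F0) / (k * k)                # case (1.11.25), (1.11.27)
```

Evaluating `amplitude` slightly to either side of the returned `sigma` should give smaller values, consistent with $\beta(\sigma)$ in (1.11.22) being minimal there.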


Exercises 
1. Find the general solution to 
d? d 
(1.11.28) Get yp tua 3sinat. 


2. For the equation in Exercise 1, find the value of o for which there is resonance. 


3. Would the answer to Exercise 2 change if the right side of (1.11.28) were changed 
to 
10sin ot? 


Explain. 


4. Redo Exercises 1-2 with (1.11.28) replaced by each of the following: 


Pu + du + 3u = si t 
a A u = sinot, 
d? d 
a + = + 3u = 2sinot. 


5. Redo Exercise 1 with (1.11.28) replaced by the following: 


du | du Se 
qi qt t= 3sinat.
Discuss the issue of resonance in this case.


Figure 1.12.1. Mass on a spring


1.12. Spring motion 


We consider the motion of a body of mass $m$, attached to one end of a spring, as
depicted in Figure 1.12.1. The other end of the spring is attached to a rigid wall,
and the weight slides along the floor, pushed or pulled by the spring. We assume
that the force of the spring is a function of position:

(1.12.1) $F_s = F_s(x).$

We pick the origin to be the position where the spring is relaxed, so $F_s(0) = 0$.
A good approximation, valid for small oscillations, is

(1.12.2) $F_s(x) = -Kx,$

with a positive constant $K$ (called the spring constant). This approximation loses
accuracy if $|x|$ is large. Sliding along the floor typically produces a frictional force
that is a function of the velocity $v = dx/dt$. A good approximation for the frictional
force is

(1.12.3) $F_f = F_f(v) = -\alpha v,$

where $\alpha$ is a positive constant, called the coefficient of friction. The total force on
the mass is $F = F_s + F_f$, and Newton's law $F = ma$ yields the differential equation

(1.12.4) $m\dfrac{d^2x}{dt^2} + \alpha\dfrac{dx}{dt} + Kx = 0.$

This has the same form as (1.9.35), i.e.,

(1.12.5) $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + k^2x = 0,$

with

(1.12.6) $b = \dfrac{\alpha}{m}, \quad k^2 = \dfrac{K}{m},$

both positive, and the analysis of (1.9.35) applies here, including notions of
oscillatory damped, critically damped, and overdamped motion.

One can consider systems of several masses, connected via springs. These
situations lead to systems of differential equations, studied in Chapter 3.
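
Since critical damping occurs for $k^2 = b^2/4$ (as recalled in §1.11), the three regimes for the spring are separated by the sign of $b^2 - 4k^2$, which has the same sign as $\alpha^2 - 4mK$ when $b = \alpha/m$ and $k^2 = K/m$. A minimal Python sketch of that classification follows; the function name `classify_damping` is hypothetical, and the exact comparison with zero is only meaningful for exactly representable inputs.

```python
def classify_damping(m, alpha, K):
    """Classify solutions of m x'' + alpha x' + K x = 0 (m, alpha, K > 0)."""
    disc = alpha**2 - 4 * m * K      # same sign as b^2 - 4 k^2
    if disc < 0:
        return "damped oscillatory"  # complex characteristic roots
    if disc == 0:
        return "critically damped"   # double real root
    return "overdamped"              # two distinct negative real roots
```

For example, with $m = 1$ and $K = 4$ the threshold value is $\alpha = 4$.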


Exercises


1. Suppose one has a spring system as in Figure 1.12.1. Assume the mass $m$ is 2
kg and the spring constant $K$ is 6 kg/sec$^2$. There is a frictional force of $\alpha$ kg/sec.
Find the values of $\alpha$ for which the spring motion is

a) damped oscillatory,

b) critically damped,

c) overdamped.


2. In the context of Exercise 1, suppose there is also an external force of the form
$10\sin\sigma t$ kg-m/sec$^2$.
(Assume $x$ is given in meters.) Take
$\alpha = 2$,
so (1.12.4) becomes

$2\dfrac{d^2x}{dt^2} + 2\dfrac{dx}{dt} + 6x = 10\sin\sigma t.$

Find the value of $\sigma$ for which there is resonance.


1.13. RLC circuits 


Here we derive a differential equation for the current flowing through the circuit 
depicted in Figure 1.13.1, which consists of a resistor, with resistance R (in ohms), 
a capacitor, with capacitance C (in farads), and an inductor, with inductance L (in 
henrys). The circuit is plugged into a source of electricity, providing voltage E(t) 
(in volts). As stated, we want to find a differential equation for the current I(t) (in 
amps). 

The equation is derived using two types of basic laws. The first type consists 
of two rules, which are special cases of Kirchhoff’s laws: 


(A) The sum of the voltage drops across the three circuit elements is E(t). 
(B) For each t, the same current I(t) flows through each circuit element. 


For more complicated circuits than the one depicted in Figure 1.13.1, these rules
take a more elaborate form. We return to this in Chapter 3.

The second type of law specifies the voltage drop across each circuit element:

(a) Resistor: $V = IR$,

(b) Inductor: $V = L\dfrac{dI}{dt}$,

(c) Capacitor: $V = \dfrac{Q}{C}$.

As stated above, $V$ is measured in volts, $I$ in amps, $R$ in ohms, $L$ in henrys, and
$C$ in farads. In addition, $Q$ is the charge on the capacitor, measured in coulombs.
The rule (c) is supplemented by the following formula for the current across the
capacitor:

(c2) $I = \dfrac{dQ}{dt}.$

In (b) and (c2), time is measured in seconds.

In Figure 1.13.1, the circuit elements are numbered. We let $V_j = V_j(t)$ denote
the voltage drop across element $j$. Rules (A), (B), and (a) give

(1.13.1) $V_1 + V_2 + V_3 = E(t),$

(1.13.2) $V_1 = RI.$

Rules (B), (b), and (c)–(c2) give differential equations:

(1.13.3) $L\dfrac{dI}{dt} = V_3,$

(1.13.4) $C\dfrac{dV_2}{dt} = I.$

Plugging (1.13.2)–(1.13.3) into (1.13.1) gives

(1.13.5) $RI + V_2 + L\dfrac{dI}{dt} = E(t).$

Applying $d/dt$ to (1.13.5) and using (1.13.4) gives

(1.13.6) $L\dfrac{d^2I}{dt^2} + R\dfrac{dI}{dt} + \dfrac{1}{C}I = E'(t).$


This is the equation for the RLC circuit in Figure 1.13.1. If we divide by $L$ we get

(1.13.7) $\dfrac{d^2I}{dt^2} + \dfrac{R}{L}\dfrac{dI}{dt} + \dfrac{1}{LC}I = \dfrac{E'(t)}{L},$

which has the same form as the (linearized) damped driven pendulum (1.11.5), with

(1.13.8) $b = \dfrac{R}{L}, \quad k^2 = \dfrac{1}{LC},$

except that at this point $E'(t)/L$ is not specified to agree with the right side of
(1.11.5). However, if alternating current powers this circuit, it is reasonable
to take

(1.13.9) $E(t) = E_0\cos\sigma t,$

so

(1.13.10) $\dfrac{1}{L}E'(t) = -\dfrac{\sigma E_0}{L}\sin\sigma t = F_0\sin\sigma t, \quad F_0 = -\dfrac{\sigma E_0}{L}.$

Then analyses of solutions done in §1.11, including analyses of resonance phenomena,
apply in this setting.

Actually, in this setting a different perspective on resonance is in order. The
frequency $\sigma/2\pi$ cycles/sec of the alternating current is typically fixed, while one
might be able to adjust the capacitance $C$. Let us assume $R$ and $L$ are also fixed,
so $b$ in (1.13.8) is fixed but one might adjust $k$. Recalling the formulas (1.11.16)
and (1.11.21), which in this setting take the form

(1.13.11) $I_p(t) = A\sin(\sigma t + \theta), \quad A = \dfrac{|F_0|}{\sqrt{(k^2 - \sigma^2)^2 + (b\sigma)^2}},$

we see that for fixed $b$ and $\sigma$, this amplitude is maximized for $k$ satisfying

(1.13.12) $k^2 = \sigma^2,$

i.e., for

(1.13.13) $LC = \dfrac{1}{\sigma^2}.$
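
Condition (1.13.13) gives the resonant capacitance once the inductance and the driving frequency are known. The sketch below is illustrative; it assumes the frequency is supplied in cycles/sec, so that $\sigma = 2\pi \cdot$ (frequency), and the function name `resonant_capacitance` is hypothetical.

```python
import math

def resonant_capacitance(L, freq_hz):
    """Capacitance C (farads) with L*C = 1/sigma^2, sigma = 2*pi*freq_hz."""
    sigma = 2 * math.pi * freq_hz    # angular frequency in radians/sec
    return 1 / (L * sigma**2)        # (1.13.13) solved for C
```

For instance, `resonant_capacitance(0.5, 50.0)` returns the capacitance that tunes a 0.5 henry inductor to a 50 hertz source.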


More elaborate circuits, containing a larger number of circuit elements, and 
more loops, are naturally treated in the context of systems of differential equations. 
See Chapter 3 for more on this. 


REMARK. Consistent with formulas (a)–(c) and (c2), the units mentioned above
are related as follows:

(1.13.14)
1 amp = 1 coulomb/sec
1 farad = 1 coulomb/volt
1 henry = 1 volt-sec/amp
1 ohm = 1 volt/amp

To relate these to other physical units, we mention that

(1.13.15)
1 volt = 1 joule/coulomb
1 watt = 1 volt-amp = 1 joule/sec
1 joule = 1 Newton-meter
1 Newton = 1 kg-m/sec$^2$.

The force of gravity at the surface of the Earth on a 1 kg object is 9.8 Newtons,
or, alternatively, 2.2 pounds. In other words, one Newton is about 0.224 pounds.
Hence one joule is about 0.735 foot-pounds.

The coulomb is a unit of charge with the following property. If two particles,
of charge $q_1$ and $q_2$ coulombs, are separated by $r$ meters, the force between them
is given by Coulomb's law:

(1.13.16) $F = k\dfrac{q_1q_2}{r^2}$ Newtons, $\quad k = 8.99 \times 10^9.$

Investigations into the nature of electrons have shown that

(1.13.17) $-1$ coulomb = charge of $6.24 \times 10^{18}$ electrons.

In connection with this, we mention that one gram of water contains $3.3 \times 10^{23}$
electrons.


Figure 1.13.1. RLC circuit


Exercises


1. Consider a circuit as in Figure 1.13.1. Assume the inductance is 4 henrys and 
the applied current has the form (1.13.9) with a frequency of 60 hertz, i.e., 60 
cycles/sec. Find the value of the capacitance C, in farads, to achieve resonance. 


2. Redo Exercise 1, this time with inductance of 10~® henry and applied current 
of the form (1.13.9) with a frequency of 120 megahertz. 


1.14. Nonhomogeneous equations II—variation of parameters 


Here we present another approach to solving

(1.14.1) $\dfrac{d^2x}{dt^2} + b\dfrac{dx}{dt} + cx = f(t)$

(with constant $b$ and $c$) called the method of variation of parameters. It works
as follows. Let $y_1(t)$ and $y_2(t)$ be a complete set of solutions of the homogeneous
equation

(1.14.2) $\dfrac{d^2y}{dt^2} + b\dfrac{dy}{dt} + cy = 0.$

The method consists of seeking a solution to (1.14.1) in the form

(1.14.3) $x(t) = u_1(t)y_1(t) + u_2(t)y_2(t),$

and finding equations for $u_j(t)$ that are simpler than the original equation (1.14.1).
We have

(1.14.4) $x' = u_1'y_1 + u_2'y_2 + u_1y_1' + u_2y_2'.$

It will be convenient to arrange that $x''$ not involve second order derivatives of $u_1$
and $u_2$. To achieve this, we impose the condition

(1.14.5) $u_1'y_1 + u_2'y_2 = 0.$

Then $x'' = u_1'y_1' + u_2'y_2' + u_1y_1'' + u_2y_2''$, and using (1.14.2) to replace $y_j''$ by $-by_j' - cy_j$,
we get

(1.14.6) $x'' = u_1'y_1' + u_2'y_2' - (by_1' + cy_1)u_1 - (by_2' + cy_2)u_2,$

hence

(1.14.7) $x'' + bx' + cx = y_1'u_1' + y_2'u_2'.$

Thus we have a solution to (1.14.1) in the form (1.14.3) provided $u_1'$ and $u_2'$
solve

(1.14.8) $y_1u_1' + y_2u_2' = 0, \quad y_1'u_1' + y_2'u_2' = f.$

This linear system for $u_1'$ and $u_2'$ has the explicit solution

(1.14.9) $u_1' = -\dfrac{y_2}{W}f, \quad u_2' = \dfrac{y_1}{W}f,$

where $W(t)$ is the following determinant, called the Wronskian determinant,

(1.14.10) $W = y_1y_2' - y_2y_1' = \det\begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix}.$

Determinants will be studied in the next chapter. The reader who has not seen
them can take the first identity in (1.14.10) as a definition and ignore the second
identity (for now).

Note that if the roots of the characteristic polynomial $p(r) = r^2 + br + c$ are
distinct, $r_+ \neq r_-$, we can take

(1.14.11) $y_1 = e^{r_+t}, \quad y_2 = e^{r_-t},$

and then

(1.14.12) $W(t) = r_-e^{r_+t}e^{r_-t} - r_+e^{r_+t}e^{r_-t} = (r_- - r_+)e^{(r_+ + r_-)t},$

which is nowhere vanishing. If there is a double root, $r_+ = r_- = r$, we can take

(1.14.13) $y_1 = e^{rt}, \quad y_2 = te^{rt},$

and then

(1.14.14) $W(t) = e^{rt}(e^{rt} + rte^{rt}) - rte^{rt}e^{rt} = e^{2rt},$


which is also nowhere vanishing.

Returning to (1.14.9), we can take

(1.14.15) $u_1(t) = -\displaystyle\int_0^t \dfrac{y_2(s)}{W(s)}f(s)\,ds + C_1, \quad u_2(t) = \displaystyle\int_0^t \dfrac{y_1(s)}{W(s)}f(s)\,ds + C_2,$

so

(1.14.16) $x(t) = C_1y_1(t) + C_2y_2(t) + \displaystyle\int_0^t \bigl[y_2(t)y_1(s) - y_1(t)y_2(s)\bigr]\dfrac{f(s)}{W(s)}\,ds.$

Denote the last term, i.e., the integral, by $x_p(t)$.


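Note that when the characteristic polynomial $r^2 + br + c$ has distinct roots
$r_+ \neq r_-$ and (1.14.11)–(1.14.12) hold, we get

(1.14.17) $x_p(t) = \dfrac{1}{r_- - r_+}\displaystyle\int_0^t \bigl[e^{r_-t + r_+s} - e^{r_+t + r_-s}\bigr]\dfrac{f(s)}{e^{(r_+ + r_-)s}}\,ds = \dfrac{1}{r_- - r_+}\displaystyle\int_0^t \bigl[e^{r_-(t-s)} - e^{r_+(t-s)}\bigr]f(s)\,ds.$

When the characteristic polynomial has double roots $r_+ = r_- = r$ and (1.14.13)–(1.14.14)
hold, we get

(1.14.18) $x_p(t) = \displaystyle\int_0^t \bigl[te^{rt}e^{rs} - e^{rt}se^{rs}\bigr]\dfrac{f(s)}{e^{2rs}}\,ds = \displaystyle\int_0^t (t - s)e^{r(t-s)}f(s)\,ds.$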
The next section will continue the study of the Wronskian. Further material 
on the Wronskian and the method of variation of parameters, in a more general 
context, can be found in Chapter 3. 
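
Formulas (1.14.17)–(1.14.18) express $x_p(t)$ as a single integral, so they can be evaluated numerically for any reasonable $f$. The following Python sketch does this with a composite Simpson rule; the quadrature choice, the function names `simpson` and `particular_solution`, and the tolerance separating distinct from double roots are all assumptions made for illustration. Real $b$, $c$, and $f$ are assumed, so the real part of the result is returned.

```python
import cmath

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals; g may be complex valued."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def particular_solution(b, c, f, t, n=2000, tol=1e-12):
    """x_p(t) for x'' + b x' + c x = f(t), via (1.14.17) or (1.14.18)."""
    disc = cmath.sqrt(b * b - 4 * c)
    r_plus, r_minus = (-b + disc) / 2, (-b - disc) / 2
    if abs(r_plus - r_minus) > tol:              # distinct roots, (1.14.17)
        def kernel(s):
            return (cmath.exp(r_minus * (t - s))
                    - cmath.exp(r_plus * (t - s))) / (r_minus - r_plus)
    else:                                        # double root, (1.14.18)
        def kernel(s):
            return (t - s) * cmath.exp(r_plus * (t - s))
    return simpson(lambda s: kernel(s) * f(s), 0.0, t, n).real
```

As a sanity check, for $x'' + x = 1$ the kernel reduces to $\sin(t - s)$, and the integral equals $1 - \cos t$.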


Exercises


Use the method of variation of parameters to solve each of the following for $x = x(t)$.

1. $x'' + x = e^t.$

2. $x'' + x = \sin t.$

3. $x'' + x = t.$

4. $x'' + x = t^2.$

5. $x'' + x = \tan t.$


1.15. Variable coefficient second order equations


The general, possibly nonlinear, second order differential equation

(1.15.1) $\dfrac{d^2x}{dt^2} = F\Bigl(t, x, \dfrac{dx}{dt}\Bigr),$

has already been mentioned in §1.4. If $F(t, x, v)$ is defined and smooth on a
neighborhood of $(t_0, x_0, v_0)$, and one imposes an initial condition

(1.15.2) $x(t_0) = x_0, \quad x'(t_0) = v_0,$

it is a fundamental result that the initial value problem (1.15.1)–(1.15.2) has a
unique solution, at least for $t$ in some interval containing $t_0$. A more general result
of this sort will be proven in Chapter 4.

Linear second order equations have the form

(1.15.3) $a(t)\dfrac{d^2x}{dt^2} + b(t)\dfrac{dx}{dt} + c(t)x = f(t).$

The existence and uniqueness results stated above apply. There are many specific
and much-studied examples, such as Bessel's equation,

(1.15.4) $\dfrac{d^2x}{dt^2} + \dfrac{1}{t}\dfrac{dx}{dt} + \Bigl(1 - \dfrac{\nu^2}{t^2}\Bigr)x = 0,$

whose solutions are called Bessel functions, and Airy's equation,

(1.15.5) $\dfrac{d^2x}{dt^2} - tx = 0,$

whose solutions are Airy functions, just to mention two examples. Such functions
are important and show up in many contexts. We will take a closer look at Bessel's
equation in the next section. Linear variable coefficient equations could arise from
RLC circuits in which one has variable capacitors, resistors, and inductors, turning
(1.13.6) into

(1.15.6) $L(t)\dfrac{d^2I}{dt^2} + R(t)\dfrac{dI}{dt} + \dfrac{1}{C(t)}I = E'(t).$

The most frequent source of such equations as (1.15.4)–(1.15.5) comes from the
theory of partial differential equations (PDE). One such indication of how (1.15.4)
arises is given in Appendix 1.A. The reader can find out much more about these
equations in a text on PDE such as [45]. Solutions to these equations cannot
generally be given in terms of elementary functions, such as exponential functions,
but are further special functions, for which many analytical techniques have been
developed.


As with the exponential function, analyzed in §1.1, power series techniques are
very useful. We illustrate this by producing a power series

(1.15.7) $x(t) = \displaystyle\sum_{k=0}^\infty a_kt^k$

for the solution to the Airy equation (1.15.5), with initial data

(1.15.8) $x(0) = 1, \quad x'(0) = 0.$

If (1.15.7) is a convergent power series, then

(1.15.9) $x''(t) = \displaystyle\sum_{k=2}^\infty k(k-1)a_kt^{k-2} = \displaystyle\sum_{k=0}^\infty (k+2)(k+1)a_{k+2}t^k,$

while

(1.15.10) $tx(t) = \displaystyle\sum_{k=0}^\infty a_kt^{k+1} = \displaystyle\sum_{k=1}^\infty a_{k-1}t^k.$

Comparison gives the recursive formula

(1.15.11) $a_{k+3} = \dfrac{a_k}{(k+3)(k+2)}.$

To get started, we note that

(1.15.12) $a_0 = x(0) = 1, \quad a_1 = x'(0) = 0, \quad a_2 = \dfrac{1}{2}x''(0) = 0.$

Thus $a_{3\ell+j} = 0$ for $j = 1, 2$, and we get

(1.15.13) $x(t) = \displaystyle\sum_{\ell=0}^\infty \alpha_\ell t^{3\ell},$

where $\alpha_\ell = a_{3\ell}$ is given recursively by

(1.15.14) $\alpha_{\ell+1} = \dfrac{\alpha_\ell}{(3\ell+3)(3\ell+2)}, \quad \alpha_0 = 1.$

The ratio test applies to show that the power series (1.15.13) converges for all $t \in \mathbb{R}$,
yielding a solution to Airy's equation (1.15.5), with initial data (1.15.8).
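
The recursion (1.15.14) is easy to run on a machine. The sketch below sums a truncated version of (1.15.13) together with its term-by-term second derivative, so that the Airy equation can be checked numerically; the function name `airy_series` and the default of 40 terms are illustrative assumptions.

```python
def airy_series(t, terms=40):
    """Partial sums of (1.15.13) and of its second derivative.

    Returns (x, d2x) for the solution of x'' - t x = 0 with x(0)=1, x'(0)=0."""
    alpha = 1.0                               # alpha_0 = 1
    x = 0.0
    d2x = 0.0
    for l in range(terms):
        n = 3 * l
        x += alpha * t**n
        if n >= 2:                            # the constant term has zero second derivative
            d2x += alpha * n * (n - 1) * t**(n - 2)
        alpha /= (3 * l + 3) * (3 * l + 2)    # (1.15.14)
    return x, d2x
```

For moderate `t`, the returned pair should satisfy `d2x - t*x` close to zero, and the first few terms are $1 + t^3/6 + t^6/180$.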


A study of power series as a technique for solving ODE in a more general setting 
is given in §3.10. 

Another useful tool is the Wronskian determinant, defined on a pair of functions
$y_1$ and $y_2$ by

(1.15.15) $W(y_1, y_2)(t) = y_1y_2' - y_2y_1' = \det\begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix}.$

If $y_1$ and $y_2$ both solve (1.15.3) with $f \equiv 0$, i.e.,

(1.15.16) $a(t)y'' + b(t)y' + c(t)y = 0,$

then substituting for $y_j''$ in

(1.15.17) $\dfrac{dW}{dt} = y_1y_2'' - y_2y_1''$

yields

(1.15.18) $\dfrac{dW}{dt} = -\dfrac{b(t)}{a(t)}W,$

a useful first order linear equation for $W$. Note that if we have such $y_1$ and $y_2$,
solving (1.15.16) with initial condition

(1.15.19) $y(t_0) = \alpha, \quad y'(t_0) = \beta,$

in the form $y(t) = C_1y_1(t) + C_2y_2(t)$ involves finding $C_1$ and $C_2$ such that

(1.15.20) $C_1y_1(t_0) + C_2y_2(t_0) = \alpha, \quad C_1y_1'(t_0) + C_2y_2'(t_0) = \beta,$

which uniquely determines $C_1$ and $C_2$ precisely when $W(y_1, y_2)(t_0) \neq 0$.

In light of the existence and uniqueness statement made above (to be proved
in Chapter 4), it follows that if $y_1$ and $y_2$ solve (1.15.16) and have nonvanishing
Wronskian, on an interval on which $a$, $b$, and $c$ are smooth and $a$ is nonvanishing,
then the general solution to (1.15.16) has the form $C_1y_1 + C_2y_2$.


Recall that the Wronskian arose in the previous section, in the treatment of 
the method of variation of parameters. This treatment is extended to a much more 
general setting in Chapter 3. 


Exercises


Equations of the form

(1.15.21) $at^2\dfrac{d^2x}{dt^2} + bt\dfrac{dx}{dt} + cx = 0$

are called Euler equations.


1. Show that $x(t) = t^r = e^{r\log t}$ solves (1.15.21) for $t > 0$ provided $r$ satisfies

(1.15.22) $ar(r-1) + br + c = 0.$


2. Show that if (1.15.22) has two distinct solutions $r_1$ and $r_2$, then
$C_1t^{r_1} + C_2t^{r_2}$
is the general solution to (1.15.21) on $t \in (0, \infty)$.


3. Show that if $r$ is a double root of (1.15.22), then
$C_1t^r + C_2(\log t)t^r$
is the general solution to (1.15.21) for $t \in (0, \infty)$.


4. Find the coefficients $a_k$ in the power series expansion

$x(t) = \displaystyle\sum_{k=0}^\infty a_kt^k$

for the solution to the Airy equation

(1.15.23) $\dfrac{d^2x}{dt^2} - tx = 0,$

with initial data
$x(0) = 0, \quad x'(0) = 1.$

Show that this power series converges for all $t$.


5. Show that the Wronskian of two solutions to the Airy equation (1.15.23) solves
the equation
$\dfrac{dW}{dt} = 0.$


1.16. Bessel’s equation 


Here we construct solutions to Bessel's equation

(1.16.1) $\dfrac{d^2x}{dt^2} + \dfrac{1}{t}\dfrac{dx}{dt} + \Bigl(1 - \dfrac{\nu^2}{t^2}\Bigr)x = 0.$

This is a very important equation, whose roots in partial differential equations are
discussed in Appendix 1.A. Note that if the factor $(1 - \nu^2/t^2)$ in front of $x$ had the
term 1 dropped, one would have the Euler equation

(1.16.2) $t^2x'' + tx' - \nu^2x = 0,$

with solutions

(1.16.3) $x(t) = t^{\pm\nu},$

as seen in (1.15.21)–(1.15.22). In light of this, we are motivated to set

(1.16.4) $x(t) = t^\nu y(t),$

and study the resulting differential equation for $y$,

(1.16.5) $\dfrac{d^2y}{dt^2} + \dfrac{2\nu + 1}{t}\dfrac{dy}{dt} + y = 0.$

This might seem only moderately less singular than (1.16.1) at $t = 0$, but in fact it
has a smooth solution. To obtain it, let us note that if $y(t)$ solves (1.16.5), so does
$y(-t)$, hence so does $y(t) + y(-t)$, which is even in $t$. Thus, we look for a solution
to (1.16.5) in the form

(1.16.6) $y(t) = \displaystyle\sum_{k=0}^\infty a_kt^{2k}.$

Substitution into (1.16.5) yields for the left side of (1.16.5) the power series

(1.16.7) $\displaystyle\sum_{k=0}^\infty \bigl\{(2k+2)(2k+2\nu+2)a_{k+1} + a_k\bigr\}t^{2k},$

assuming convergence, which we will examine shortly. From this we see that, as
long as

(1.16.8) $\nu \notin \{-1, -2, -3, \ldots\},$

we can fix $a_0 = a_0(\nu)$ and solve recursively for $a_{k+1}$, for each $k \geq 0$, obtaining

(1.16.9) $a_{k+1} = -\dfrac{1}{4}\,\dfrac{a_k}{(k+1)(k+\nu+1)}.$

Given (1.16.8), this recursion works, and one can readily apply the ratio test to
show that the power series (1.16.6) converges for all $t \in \mathbb{R}$.

We will find it useful to produce an explicit solution to (1.16.9). For this, it is
convenient to write

(1.16.10) $a_k = \alpha_k\beta_k\gamma_k,$

with

(1.16.11) $\alpha_{k+1} = -\dfrac{1}{4}\alpha_k, \quad \beta_{k+1} = \dfrac{\beta_k}{k+1}, \quad \gamma_{k+1} = \dfrac{\gamma_k}{k+\nu+1}.$

Clearly the first two equations have the explicit solutions

(1.16.12) $\alpha_k = \Bigl(-\dfrac{1}{4}\Bigr)^k\alpha_0, \quad \beta_k = \dfrac{\beta_0}{k!}.$

We can solve the third if we have in hand a function $\Gamma(z)$ satisfying

(1.16.13) $\Gamma(z+1) = z\Gamma(z).$

Indeed, the Euler gamma function $\Gamma(z)$, discussed in Appendix 1.B, is a smooth
function on $\mathbb{R} \setminus \{0, -1, -2, \ldots\}$ that satisfies (1.16.13). With this function in hand,
we can write

(1.16.14) $\gamma_k = \dfrac{\tilde\gamma_0}{\Gamma(k+\nu+1)},$

and putting together (1.16.10)–(1.16.14) yields

(1.16.15) $a_k = \Bigl(-\dfrac{1}{4}\Bigr)^k\dfrac{\tilde a_0}{k!\,\Gamma(k+\nu+1)}.$

We initialize this with $\tilde a_0 = 2^{-\nu}$. There results the solution $y(t) = \mathcal{J}_\nu(t)$ to (1.16.5),
and $x(t) = J_\nu(t) = t^\nu\mathcal{J}_\nu(t)$ to (1.16.1), given by

(1.16.16) $J_\nu(t) = \displaystyle\sum_{k=0}^\infty \dfrac{(-1)^k}{k!\,\Gamma(k+\nu+1)}\Bigl(\dfrac{t}{2}\Bigr)^{2k+\nu}.$

Supplementing the regularity of $\Gamma(z)$ on $\mathbb{R} \setminus \{0, -1, -2, \ldots\}$, we will see in Appendix
1.B that

(1.16.17) $\dfrac{1}{\Gamma(z)}$ is well defined and smooth in $z \in \mathbb{R}$, vanishing for $z \in \{0, -1, -2, \ldots\}$.

Consequently (1.16.16) is a valid solution to (1.16.1) for $t \in (0, \infty)$, for each $\nu \in \mathbb{R}$.
In fact,

(1.16.18) $J_\nu$ and $J_{-\nu}$ solve (1.16.1), for $\nu \in \mathbb{R}$.

The function $J_\nu$ is called a Bessel function.
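
The series (1.16.16) can be summed directly for $t > 0$. The Python sketch below is illustrative: the names `inv_gamma` and `bessel_j` and the truncation at 60 terms are assumptions, and the convention (1.16.17) that $1/\Gamma$ vanishes at $0, -1, -2, \ldots$ is implemented by hand, since the standard library function `math.gamma` raises an error at those points.

```python
import math

def inv_gamma(z):
    """1/Gamma(z), taken to be 0 at z = 0, -1, -2, ... as in (1.16.17)."""
    if z <= 0 and z == int(z):
        return 0.0
    return 1.0 / math.gamma(z)

def bessel_j(nu, t, terms=60):
    """Truncated series (1.16.16) for J_nu(t), intended for t > 0."""
    total = 0.0
    for k in range(terms):
        total += ((-1)**k / math.factorial(k)
                  * inv_gamma(k + nu + 1)
                  * (t / 2)**(2 * k + nu))
    return total
```

Two simple consistency checks are the closed form $J_{1/2}(t) = \sqrt{2/(\pi t)}\,\sin t$ and the relation $J_{-1}(t) = -J_1(t)$, both of which the truncated series reproduces to high accuracy for moderate $t$.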
Let us examine the behavior of $J_\nu(t)$ as $t \searrow 0$. We have

(1.16.19) $J_\nu(t) = \dfrac{1}{\Gamma(\nu+1)}\Bigl(\dfrac{t}{2}\Bigr)^\nu + O(t^{\nu+1}), \quad$ as $t \searrow 0.$

As long as $\nu$ satisfies (1.16.8), the coefficient $1/\Gamma(\nu+1)$ is nonzero. Furthermore,

(1.16.20) $J_{-\nu}(t) = \dfrac{1}{\Gamma(1-\nu)}\Bigl(\dfrac{t}{2}\Bigr)^{-\nu} + O(t^{-\nu+1}), \quad$ as $t \searrow 0,$

and as long as $\nu \notin \{1, 2, 3, \ldots\}$, the coefficient $1/\Gamma(1-\nu)$ is nonzero. In particular,
we see that

(1.16.21) If $\nu \notin \mathbb{Z}$, $J_\nu$ and $J_{-\nu}$ are linearly independent solutions to (1.16.1) on $(0, \infty)$.

In contrast to this, we have the following:

(1.16.22) If $n \in \mathbb{Z}$, $J_n(t) = (-1)^nJ_{-n}(t)$.

To see this, we assume $n \in \{1, 2, 3, \ldots\}$, and note that

(1.16.23) $\dfrac{1}{\Gamma(k-n+1)} = 0, \quad$ for $0 \leq k \leq n-1.$

We use this, together with the restatement of (1.16.16) that

(1.16.24) $J_\nu(t) = \displaystyle\sum_{k=0}^\infty \dfrac{(-1)^k}{\Gamma(k+1)\Gamma(k+\nu+1)}\Bigl(\dfrac{t}{2}\Bigr)^{2k+\nu},$

which follows from the identity $\Gamma(k+1) = k!$, to deduce that, for $n \in \mathbb{N}$,

(1.16.25) $J_{-n}(t) = \displaystyle\sum_{k=n}^\infty \dfrac{(-1)^k}{\Gamma(k+1)\Gamma(k-n+1)}\Bigl(\dfrac{t}{2}\Bigr)^{2k-n} = \displaystyle\sum_{\ell=0}^\infty \dfrac{(-1)^{\ell+n}}{\Gamma(\ell+1)\Gamma(\ell+n+1)}\Bigl(\dfrac{t}{2}\Bigr)^{2\ell+n} = (-1)^nJ_n(t).$

Consequently $J_\nu(t)$ and $J_{-\nu}(t)$ are linearly independent solutions to (1.16.1) as
long as $\nu \notin \mathbb{Z}$, but this fails for $\nu \in \mathbb{Z}$. We now seek a family of solutions $Y_\nu(t)$ to
(1.16.1) with the property that $J_\nu$ and $Y_\nu$ are linearly independent solutions, for
all $\nu \in \mathbb{R}$. The key to this construction lies in an analysis of the Wronskian,


(1.16.26) $W_\nu(t) = W(J_\nu, J_{-\nu})(t) = J_\nu(t)J_{-\nu}'(t) - J_\nu'(t)J_{-\nu}(t).$

By (1.15.18), we have

(1.16.27) $\dfrac{dW_\nu}{dt} = -\dfrac{1}{t}W_\nu,$

hence

(1.16.28) $W_\nu(t) = \dfrac{K(\nu)}{t}.$

To evaluate $K(\nu)$, we calculate

(1.16.29) $W(J_\nu, J_{-\nu}) = W(t^\nu\mathcal{J}_\nu, t^{-\nu}\mathcal{J}_{-\nu}) = W(\mathcal{J}_\nu, \mathcal{J}_{-\nu}) - \dfrac{2\nu}{t}\mathcal{J}_\nu(t)\mathcal{J}_{-\nu}(t).$

Since $\mathcal{J}_\nu(t)$ and $\mathcal{J}_{-\nu}(t)$ are smooth in $t$, so is $W(\mathcal{J}_\nu, \mathcal{J}_{-\nu})$, and we deduce from
(1.16.28)–(1.16.29) that

(1.16.30) $W_\nu(t) = -\dfrac{2\nu}{t}\mathcal{J}_\nu(0)\mathcal{J}_{-\nu}(0).$

Now, since $\mathcal{J}_\nu(0) = 1/2^\nu\Gamma(\nu+1)$, we have

(1.16.31) $\nu\mathcal{J}_\nu(0)\mathcal{J}_{-\nu}(0) = \dfrac{\nu}{\Gamma(\nu+1)\Gamma(1-\nu)} = \dfrac{1}{\Gamma(\nu)\Gamma(1-\nu)}.$

An important gamma function identity, stated in Appendix 1.B, is

(1.16.32) $\Gamma(\nu)\Gamma(1-\nu) = \dfrac{\pi}{\sin\pi\nu}.$

Hence (1.16.30)–(1.16.31) yields

(1.16.33) $W(J_\nu, J_{-\nu})(t) = -\dfrac{2}{\pi}\,\dfrac{\sin\pi\nu}{t}.$

This motivates the following. For $\nu \notin \mathbb{Z}$, set

(1.16.34) $Y_\nu(t) = \dfrac{J_\nu(t)\cos\pi\nu - J_{-\nu}(t)}{\sin\pi\nu}.$

Note that, by (1.16.25), numerator and denominator both vanish for $\nu \in \mathbb{Z}$. Now,
for $\nu \notin \mathbb{Z}$, we have

(1.16.35) $W(J_\nu, Y_\nu)(t) = -\dfrac{1}{\sin\pi\nu}W(J_\nu, J_{-\nu})(t) = \dfrac{2}{\pi t}.$

Consequently, for $n \in \mathbb{Z}$, we set

(1.16.36) $Y_n(t) = \displaystyle\lim_{\nu\to n}Y_\nu(t) = \dfrac{1}{\pi}\Bigl[\dfrac{\partial J_\nu(t)}{\partial\nu} - (-1)^n\dfrac{\partial J_{-\nu}(t)}{\partial\nu}\Bigr]_{\nu=n},$

and we also have (1.16.35) for $\nu \in \mathbb{Z}$. The functions $Y_\nu$ are called Bessel functions
of the second kind. See Figure 1.16.1 for graphs of $J_1(t)$ and $Y_1(t)$.

Figure 1.16.1. Graphs of $J_1(t)$ and $Y_1(t)$

Another construction of a solution to accompany $J_n(t)$ is given in Chapter 3,
in formulas (3.11.65)–(3.11.79).
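
The Wronskian identities (1.16.33) and (1.16.35) lend themselves to a numerical spot check for non-integer $\nu$. The sketch below sums the series (1.16.16) together with its term-by-term derivative and then forms $Y_\nu$ via (1.16.34); the function names `j_and_derivative` and `wronskians` and the 60-term truncation are illustrative assumptions, and the code is only meant for $t > 0$ and $\nu \notin \mathbb{Z}$.

```python
import math

def j_and_derivative(nu, t, terms=60):
    """Series (1.16.16) for J_nu(t) and its derivative in t; nu must not be an integer."""
    j = 0.0
    dj = 0.0
    for k in range(terms):
        coef = (-1)**k / (math.factorial(k) * math.gamma(k + nu + 1))
        j += coef * (t / 2)**(2 * k + nu)
        dj += coef * (2 * k + nu) / 2 * (t / 2)**(2 * k + nu - 1)
    return j, dj

def wronskians(nu, t):
    """Return (W(J_nu, J_-nu)(t), W(J_nu, Y_nu)(t)) computed from the series."""
    j, dj = j_and_derivative(nu, t)
    jm, djm = j_and_derivative(-nu, t)
    s, c = math.sin(math.pi * nu), math.cos(math.pi * nu)
    y = (j * c - jm) / s                  # (1.16.34)
    dy = (dj * c - djm) / s
    return j * djm - dj * jm, j * dy - dj * y
```

For a sample value such as $\nu = 0.3$, $t = 1.7$, the two returned numbers can be compared with $-2\sin(\pi\nu)/(\pi t)$ and $2/(\pi t)$.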


We end this section with the following integral formula for $J_\nu(t)$, which plays
an important role in further investigations, such as the behavior of $J_\nu(t)$ for large
$t$.


Proposition 1.16.1. If $\nu > -1/2$,

(1.16.37) $J_\nu(t) = \dfrac{(t/2)^\nu}{\Gamma(1/2)\Gamma(\nu+1/2)}\displaystyle\int_{-1}^1 (1 - s^2)^{\nu-1/2}e^{ist}\,ds.$


Proof. To verify (1.16.37), we replace $e^{ist}$ by its power series, integrate term by
term, and use some identities from Appendix 1.B. To begin, the integral on the
right side of (1.16.37) is equal to

(1.16.38) $\displaystyle\sum_{k=0}^\infty \dfrac{1}{(2k)!}(it)^{2k}\int_{-1}^1 s^{2k}(1 - s^2)^{\nu-1/2}\,ds.$

The identity (1.B.17) implies

(1.16.39) $\displaystyle\int_{-1}^1 s^{2k}(1 - s^2)^{\nu-1/2}\,ds = \dfrac{\Gamma(k+1/2)\Gamma(\nu+1/2)}{\Gamma(k+\nu+1)},$

so the right side of (1.16.37) equals

(1.16.40) $\dfrac{(t/2)^\nu}{\Gamma(1/2)\Gamma(\nu+1/2)}\displaystyle\sum_{k=0}^\infty \dfrac{1}{(2k)!}(it)^{2k}\,\dfrac{\Gamma(k+1/2)\Gamma(\nu+1/2)}{\Gamma(k+\nu+1)}.$

As seen in (1.B.7), we have

(1.16.41) $\Gamma\Bigl(\dfrac{1}{2}\Bigr)(2k)! = 2^{2k}k!\,\Gamma\Bigl(k + \dfrac{1}{2}\Bigr),$

so (1.16.40) is equal to

(1.16.42) $\Bigl(\dfrac{t}{2}\Bigr)^\nu\displaystyle\sum_{k=0}^\infty \dfrac{(-1)^k}{k!\,\Gamma(k+\nu+1)}\Bigl(\dfrac{t}{2}\Bigr)^{2k},$

which agrees with our formula (1.16.16) for $J_\nu(t)$.


See §3.11 for general results on ODE with regular singular points, with reference
to the study of Bessel's equation. Further material on Bessel functions, including a
study for large values of the argument $t$, and also for complex values, can be found
in Chapter 7 of [47].


Exercises


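1. Show that the Bessel functions $J_\nu$ satisfy the following recursion relations:

$\dfrac{d}{dt}\bigl(t^\nu J_\nu(t)\bigr) = t^\nu J_{\nu-1}(t), \quad \dfrac{d}{dt}\bigl(t^{-\nu}J_\nu(t)\bigr) = -t^{-\nu}J_{\nu+1}(t),$

or equivalently

$J_{\nu+1}(t) = -J_\nu'(t) + \dfrac{\nu}{t}J_\nu(t), \quad J_{\nu-1}(t) = J_\nu'(t) + \dfrac{\nu}{t}J_\nu(t).$

2. Show that $\mathcal{J}_{-1/2}(t) = \sqrt{2/\pi}\,\cos t$, and deduce that

$J_{-1/2}(t) = \sqrt{\dfrac{2}{\pi t}}\,\cos t, \quad J_{1/2}(t) = \sqrt{\dfrac{2}{\pi t}}\,\sin t.$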


Deduce from Exercise 1 that, for n € Z*, 


Ins1j2(t) = cor{T(4 ! =) } ae 


j=l 


Jn —1pa(t) = Sane ! =72) } ae 


jal 


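Hint. The differential equation (1.16.5) for $\mathcal{J}_{-1/2}$ is $y'' + y = 0$. Since $\mathcal{J}_{-1/2}(t)$
is even in $t$, $\mathcal{J}_{-1/2}(t) = C\cos t$, and the evaluation of $C$ comes from
$\mathcal{J}_{-1/2}(0) = \sqrt{2}/\Gamma(1/2) = \sqrt{2/\pi}$, thanks to (1.B.6).


3. Show that the functions $Y_\nu$ satisfy the same recursion relations as $J_\nu$, i.e.,

$\dfrac{d}{dt}\bigl(t^\nu Y_\nu(t)\bigr) = t^\nu Y_{\nu-1}(t), \quad \dfrac{d}{dt}\bigl(t^{-\nu}Y_\nu(t)\bigr) = -t^{-\nu}Y_{\nu+1}(t).$

4. The Hankel functions $H_\nu^{(1)}(t)$ and $H_\nu^{(2)}(t)$ are defined to be

$H_\nu^{(1)}(t) = J_\nu(t) + iY_\nu(t), \quad H_\nu^{(2)}(t) = J_\nu(t) - iY_\nu(t).$

Show that they satisfy the same recursion relations as $J_\nu$, i.e.,

$\dfrac{d}{dt}\bigl(t^\nu H_\nu^{(j)}(t)\bigr) = t^\nu H_{\nu-1}^{(j)}(t), \quad \dfrac{d}{dt}\bigl(t^{-\nu}H_\nu^{(j)}(t)\bigr) = -t^{-\nu}H_{\nu+1}^{(j)}(t),$

for $j = 1, 2$.

5. Show that

$H_{-\nu}^{(1)}(t) = e^{\pi i\nu}H_\nu^{(1)}(t), \quad H_{-\nu}^{(2)}(t) = e^{-\pi i\nu}H_\nu^{(2)}(t).$

6. Show that $Y_{1/2}(t) = -J_{-1/2}(t)$, and deduce that

$H_{1/2}^{(1)}(t) = -i\sqrt{\dfrac{2}{\pi t}}\,e^{it}, \quad H_{1/2}^{(2)}(t) = i\sqrt{\dfrac{2}{\pi t}}\,e^{-it}.$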


1.17. Higher order linear equations 


A linear differential equation of order $n$ has the form

(1.17.1)   $a_n(t)\frac{d^nx}{dt^n} + a_{n-1}(t)\frac{d^{n-1}x}{dt^{n-1}} + \cdots + a_0(t)x = f(t).$

If $a_j(t)$ are continuous for $t$ in an interval $I$ containing $t_0$, and $a_n(t)$ is nonvanishing
on this interval, one has a unique solution to (1.17.1) given an initial condition of
the form

(1.17.2)   $x(t_0) = \alpha_0,\ x'(t_0) = \alpha_1,\ \dots,\ x^{(n-1)}(t_0) = \alpha_{n-1}.$

(As with (1.15.1)-(1.15.2), this also follows from a general result that will be
established in Chapter 4.) If $a_j(t)$ are all constant, the equation (1.17.1) has the
form

(1.17.3)   $a_n\frac{d^nx}{dt^n} + a_{n-1}\frac{d^{n-1}x}{dt^{n-1}} + \cdots + a_0x = f(t).$

It is homogeneous if $f = 0$, in which case one has

(1.17.4)   $a_n\frac{d^nx}{dt^n} + a_{n-1}\frac{d^{n-1}x}{dt^{n-1}} + \cdots + a_0x = 0.$

We assume $a_n \ne 0$.

Methods developed in §§1.9-1.10 have natural extensions to (1.17.4) and (1.17.3).
The function $x(t) = e^{rt}$ solves (1.17.4) provided $r$ satisfies the characteristic equation

(1.17.5)   $a_nr^n + a_{n-1}r^{n-1} + \cdots + a_0 = 0.$

The fundamental theorem of algebra guarantees that (1.17.5) has $n$ roots, i.e., there
exist $r_1, \dots, r_n \in \mathbb{C}$ such that

(1.17.6)   $a_nr^n + a_{n-1}r^{n-1} + \cdots + a_0 = a_n(r - r_1)\cdots(r - r_n).$

A proof of this theorem is given in §2.C. These roots $r_1, \dots, r_n$ may or may not be
distinct. If they are distinct, the general solution to (1.17.4) has the form

(1.17.7)   $x(t) = C_1e^{r_1t} + \cdots + C_ne^{r_nt}.$

If $r_j$ is a root of multiplicity $k$, one has solutions to (1.17.4) of the form

(1.17.8)   $C_1e^{r_jt} + C_2te^{r_jt} + \cdots + C_kt^{k-1}e^{r_jt}.$
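
To see (1.17.8) at work on a concrete case, here is a small Python sketch that applies a constant-coefficient operator to $t^ie^{rt}$ exactly, via the Leibniz rule. The example polynomial $p(r) = (r - 2)^2(r + 1) = r^3 - 3r^2 + 4$ and the helper name `apply_operator` are illustrative assumptions, not taken from the text.

```python
import math

def apply_operator(coeffs, i, r, t):
    """Evaluate sum_m coeffs[m] * d^m/dt^m (t^i e^{rt}) at the point t.

    coeffs[m] is the coefficient a_m of the m-th derivative. Each derivative
    is expanded by the Leibniz rule: l derivatives fall on t^i and the
    remaining m - l fall on e^{rt}."""
    total = 0.0
    for m, a_m in enumerate(coeffs):
        deriv = 0.0
        for l in range(min(m, i) + 1):
            falling = math.factorial(i) // math.factorial(i - l)  # i(i-1)...(i-l+1)
            deriv += math.comb(m, l) * falling * t ** (i - l) * r ** (m - l)
        total += a_m * deriv * math.exp(r * t)
    return total

if __name__ == "__main__":
    coeffs = [4, 0, -3, 1]  # x''' - 3x'' + 4x, i.e. p(r) = (r - 2)^2 (r + 1)
    print(apply_operator(coeffs, 1, 2, 1.3))   # t e^{2t}: double root, residual near 0
    print(apply_operator(coeffs, 1, -1, 1.0))  # t e^{-t}: simple root, residual 9/e
```

The residual for $te^{rt}$ works out to $e^{rt}\bigl(p(r)t + p'(r)\bigr)$, so it vanishes exactly when $r$ is a root of multiplicity at least two.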


This observation can be used to yield a fresh perspective on what makes the calculations
in §1.10 work. Consider for example the equation

(1.17.9)   $ax'' + bx' + cx = e^{\kappa t}.$

The right side solves the equation $(d/dt - \kappa)e^{\kappa t} = 0$, so any solution to (1.17.9) also
solves

(1.17.10)   $\Bigl(\frac{d}{dt} - \kappa\Bigr)\Bigl(a\frac{d^2}{dt^2} + b\frac{d}{dt} + c\Bigr)x = 0,$

a homogeneous equation whose characteristic polynomial is

(1.17.11)   $q(r) = (r - \kappa)(ar^2 + br + c) = (r - \kappa)p(r).$

If $\kappa$ is not a root of $p(r)$, then certainly (1.17.9) has a solution of the form $Ae^{\kappa t}$. If
$\kappa$ is a root of $p(r)$, then it is a double (or, perhaps, triple) root of $q(r)$, and (1.17.8)
applies, leading one to (1.10.17) or (1.10.25).

One can also extend the method of variation of parameters to higher order 
equations (1.17.3), though the details get grim. 


The equations (1.17.1)—(1.17.4) can each be recast as n x n first order systems 
of differential equations, and all the results on these equations are special cases of 
results to be covered in Chapter 3, so we will say no more here, except to advertise 
that this transformation leads to a much simplified approach to the method of 
variation of parameters. 


Exercises 


1. Assume the existence and uniqueness results for the solution to (1.17.1) stated
in the first paragraph of this section. Show that there exist $n$ solutions $u_j$ to

$a_n(t)u_j^{(n)}(t) + a_{n-1}(t)u_j^{(n-1)}(t) + \cdots + a_0(t)u_j(t) = 0$

on $I$ such that every solution to (1.17.1) with $f = 0$ can be written uniquely in the
form

$x(t) = C_1u_1(t) + \cdots + C_nu_n(t).$

For general continuous $f$, let $x_p$ be a particular solution to (1.17.1). Show that if
$x(t)$ is an arbitrary solution to (1.17.1), then there exist unique constants $C_j$, $1 \le j \le n$,
such that

$x(t) = x_p(t) + C_1u_1(t) + \cdots + C_nu_n(t).$

This is called the general solution to (1.17.1).

Hint. Require $u_j^{(k-1)}(t_0) = \delta_{jk}$, $1 \le k \le n$, where $\delta_{jk} = 1$ for $j = k$, $0$ for $j \ne k$.


2. Find the general solution to each of the following equations for $x = x(t)$.

(a)   $\frac{d^4x}{dt^4} - x = 0.$

(b)   $\frac{d^3x}{dt^3} - x = 0.$

(c)   $x''' - 2x'' - 4x' + 8x = 0.$

(d)   $x''' - 2x'' + 4x' - 8x = 0.$


me t 


3. For each of the cases (a)-(e) in Exercise 1 of §1.10, produce a third or fourth
order homogeneous differential equation solved by $x(t)$.


Exercises 4-6 will exploit the fact that if the characteristic polynomial (1.17.6)
factors as stated there, then the left side of (1.17.4) is equal to

$a_n\prod_{j=1}^n\Bigl(\frac{d}{dt} - r_j\Bigr)x = a_n\Bigl(\frac{d}{dt} - r_1\Bigr)\cdots\Bigl(\frac{d}{dt} - r_n\Bigr)x.$


4. Show that

$\Bigl(\frac{d}{dt} - r_j\Bigr)(e^{rt}u) = e^{rt}\Bigl(\frac{d}{dt} - r_j + r\Bigr)u,$

and more generally

$\prod_{j=1}^n\Bigl(\frac{d}{dt} - r_j\Bigr)(e^{rt}u) = e^{rt}\prod_{j=1}^n\Bigl(\frac{d}{dt} - r_j + r\Bigr)u.$


5. Suppose $r_j$ is a root of multiplicity $k$ of (1.17.6). Show that $x(t) = e^{r_jt}u$ solves
(1.17.4) if and only if

$\prod_{\{\ell : r_\ell \ne r_j\}}\Bigl(\frac{d}{dt} - r_\ell + r_j\Bigr)\Bigl(\frac{d}{dt}\Bigr)^ku = 0.$

Use this to show that functions of the form (1.17.8) solve (1.17.4).


6. In light of Exercise 5, use an inductive argument to show the following. Assume
the roots $\{r_j\}$ of (1.17.6) are

$r_\nu$, with multiplicity $k_\nu$, $\quad 1 \le \nu \le m$, $\quad k_1 + \cdots + k_m = n$.

Then the general solution to (1.17.4) is a linear combination of

$t^\ell e^{r_\nu t}$, $\quad 0 \le \ell \le k_\nu - 1$, $\quad 1 \le \nu \le m$.


1.18. The Laplace transform 


The Laplace transform provides a tool to treat nonhomogeneous differential equations
of the form

(1.18.1)   $c_n\frac{d^nf}{dt^n} + c_{n-1}\frac{d^{n-1}f}{dt^{n-1}} + \cdots + c_0f(t) = g(t),$

(1.18.2)   $f(0) = a_0,\ \dots,\ f^{(n-1)}(0) = a_{n-1},$

for certain classes of functions $g$. It is defined as follows. Assume $f : \mathbb{R}^+ \to \mathbb{C}$ is
integrable on $[0, R]$ for all $R < \infty$, and satisfies

(1.18.3)   $\int_0^\infty |f(t)|e^{-at}\,dt < \infty, \quad \forall\, a > A,$

for some $A \in \mathbb{R}$. We define the Laplace transform of $f$ by

(1.18.4)   $\mathcal{L}f(s) = \int_0^\infty f(t)e^{-st}\,dt, \quad \operatorname{Re} s > A.$

By our hypotheses, this integral is absolutely convergent for each $s$ in the half-plane
$H_A = \{s \in \mathbb{C} : \operatorname{Re} s > A\}$. For our current purposes, it will suffice to take $s$ real,
in $(A, \infty)$. Note that, for such $s$,


(1.18.5)   $\frac{d}{ds}\mathcal{L}f(s) = \mathcal{L}g(s), \quad g(t) = -tf(t).$

If we assume that $f'$ is continuous on $[0, \infty)$ and

(1.18.6)   $|f(t)| + |f'(t)| \le C_\varepsilon e^{(A+\varepsilon)t}$, for $t \ge 0$,

for each $\varepsilon > 0$, we can integrate by parts and get

(1.18.7)   $\mathcal{L}f'(s) = s\mathcal{L}f(s) - f(0),$

and similar hypotheses for higher derivatives of $f$ give

(1.18.8)   $\mathcal{L}f^{(k)}(s) = s^k\mathcal{L}f(s) - s^{k-1}f(0) - \cdots - f^{(k-1)}(0).$

Hence, if $f$ satisfies an ODE of the form (1.18.1)-(1.18.2) and if $f, f', \dots, f^{(n-1)}$ all
satisfy (1.18.6), and $g$ satisfies (1.18.3), we have

(1.18.9)   $p(s)\mathcal{L}f(s) = \mathcal{L}g(s) + q(s),$

with

(1.18.10)   $p(s) = c_ns^n + c_{n-1}s^{n-1} + \cdots + c_0,$
            $q(s) = c_n(a_0s^{n-1} + \cdots + a_{n-1}) + \cdots + c_1a_0.$


If all the roots of $p(s)$ satisfy $\operatorname{Re} s \le B$, we have

(1.18.11)   $\mathcal{L}f(s) = \frac{\mathcal{L}g(s) + q(s)}{p(s)}, \quad \operatorname{Re} s > C = \max(A, B).$

Making use of (1.18.11) to solve (1.18.1)-(1.18.2) brings in two problems, which we
now state.


I. THE RECOGNITION PROBLEM. Given the right side of (1.18.11), i.e., given

(1.18.12)   $\frac{\mathcal{L}g(s) + q(s)}{p(s)} = R(s),$

find a function $f_1 : [0, \infty) \to \mathbb{C}$, such that

(1.18.13)   $\mathcal{L}f_1(s) = R(s)$, for $\operatorname{Re} s > C$.


II. THE UNIQUENESS PROBLEM. Given $f$ and $f_1 : [0, \infty) \to \mathbb{C}$, both satisfying
(1.18.3), one wants to know that

(1.18.14)   $\mathcal{L}f(s) = \mathcal{L}f_1(s),\ \forall\, s > A \implies f = f_1$ on $[0, \infty)$.
The uniqueness problem has a satisfactory solution. As long as $f$ and $f_1$ satisfy
the hypotheses just stated, the result (1.18.14) is true. The proof of this can
be found in §3.3 of [47]. In addition there are inversion formulas. Here is one,
established in §3.3 of [47].


Proposition 1.18.1. Assume $f$ and $f'$ are continuous on $[0, \infty)$, and

(1.18.15)   $|f(t)| + |f'(t)| \le Ce^{At}, \quad t \ge 0.$

Then, for $t > 0$,

(1.18.16)   $tf(t) = -\frac{1}{2\pi}\int_{-\infty}^\infty \frac{d}{ds}\mathcal{L}f(B + i\xi)\,e^{t(B + i\xi)}\,d\xi,$

as long as $B > A$, with an absolutely convergent integral on the right side.


In light of the uniqueness, if $f$ satisfies (1.18.3), we say

(1.18.17)   $g = \mathcal{L}f \implies f = \mathcal{L}^{-1}g,$

and call $\mathcal{L}^{-1}$ the inverse Laplace transform.


Generally speaking, for functions $R(s)$ that arise in (1.18.12), calculation of the
integral

(1.18.18)   $\int_{-\infty}^\infty R'(B + i\xi)e^{i\xi t}\,d\xi$

is not so easy, though methods of residue calculus, discussed in §4.1 of [47], can be
effective. For the purpose of using (1.18.11) to solve (1.18.1)-(1.18.2), by finding $f$
that satisfies

(1.18.19)   $\mathcal{L}f(s) = R(s),$

with $R(s)$ as in (1.18.12), it is useful to have a collection of functions that are known
Laplace transforms, in order to solve the recognition problem.

To start our collection, we consider the Laplace transform of $e^{at}$,

(1.18.20)   $\int_0^\infty e^{at}e^{-st}\,dt = \int_0^\infty e^{-(s-a)t}\,dt = \frac{1}{s - a}.$

If $a$ is real, this is valid for $\operatorname{Re} s > a$. However, using results from §1.1, we find it
useful to note that (1.18.20) holds for complex $a$, as long as $\operatorname{Re} s > \operatorname{Re} a$. We can
apply this to

(1.18.21)   $f(t) = \cos at = \frac12\bigl(e^{iat} + e^{-iat}\bigr),$

for $a \in \mathbb{R}$, to get

(1.18.22)   $\mathcal{L}f(s) = \frac12\Bigl(\frac{1}{s - ia} + \frac{1}{s + ia}\Bigr) = \frac{s}{s^2 + a^2}.$

Similar techniques yield the table of Laplace transforms presented in Table 1.

If $a \in \mathbb{R}$, the range of validity of (a)-(b) is $\operatorname{Re} s > 0$, and that of (c)-(d) is
$\operatorname{Re} s > |a|$.
Table 1. Table of Laplace transforms

      $f(t)$        $\mathcal{L}f(s)$
(a)   $\sin at$     $a/(s^2 + a^2)$
(b)   $\cos at$     $s/(s^2 + a^2)$
(c)   $\sinh at$    $a/(s^2 - a^2)$
(d)   $\cosh at$    $s/(s^2 - a^2)$
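
The entries of Table 1 can be spot-checked numerically. The sketch below approximates $\mathcal{L}f(s) = \int_0^\infty f(t)e^{-st}\,dt$ for real $s$ by truncating the integral and applying composite Simpson's rule. The helper name `laplace`, the truncation point, and the step count are assumptions made for illustration; the truncation is adequate only when $f(t)e^{-st}$ is negligible beyond the chosen upper limit.

```python
import math

def laplace(f, s, upper=60.0, n=60000):
    """Approximate integral_0^upper f(t) e^{-st} dt by composite Simpson's
    rule with n subintervals (n must be even)."""
    h = upper / n
    total = f(0.0) + f(upper) * math.exp(-s * upper)
    for j in range(1, n):
        t = j * h
        weight = 4 if j % 2 else 2
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

if __name__ == "__main__":
    a, s = 3.0, 2.0
    print(laplace(lambda t: math.sin(a * t), s), a / (s * s + a * a))
    print(laplace(lambda t: math.cos(a * t), s), s / (s * s + a * a))
    # For sinh and cosh one needs s > |a|; here a = 1.
    print(laplace(math.sinh, s), 1 / (s * s - 1))
    print(laplace(math.cosh, s), s / (s * s - 1))
```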


Laplace transforms of other functions, such as $e^{-bt}\cos at$, etc., can be identified
via the identity

(1.18.23)   $\mathcal{L}(e^{-bt}f)(s) = \mathcal{L}f(s + b).$

Also, one can turn (1.18.5) around, to write

(1.18.24)   $\mathcal{L}(tf)(s) = -\frac{d}{ds}\mathcal{L}f(s),$

and, inductively,

(1.18.25)   $\mathcal{L}(t^nf)(s) = (-1)^n\frac{d^n}{ds^n}\mathcal{L}f(s).$

For example,

(1.18.26)   $\mathcal{L}(t^ne^{at})(s) = (-1)^n\frac{d^n}{ds^n}(s - a)^{-1} = \frac{n!}{(s - a)^{n+1}},$

for $a \in \mathbb{C}$, $\operatorname{Re} s > \operatorname{Re} a$. In particular,

(1.18.27)   $f(t) = t^n \implies \mathcal{L}f(s) = n!\,s^{-n-1}.$

Of course, by (1.18.23), the result (1.18.26) follows from its special case (1.18.27).
A natural generalization of (1.18.27) arises from taking

(1.18.28)   $f_z(t) = t^{z-1}, \quad z > 0.$

We get

(1.18.29)   $\mathcal{L}f_z(s) = \int_0^\infty t^{z-1}e^{-st}\,dt = \Bigl(\int_0^\infty t^{z-1}e^{-t}\,dt\Bigr)s^{-z} = \Gamma(z)s^{-z},$

where

(1.18.30)   $\Gamma(z) = \int_0^\infty e^{-t}t^{z-1}\,dt, \quad z > 0,$

is the Gamma function, which plays a role in §1.16, via (1.16.13)-(1.16.16), and is
treated in Appendix 1.B. Let us note that (1.18.24) implies

(1.18.31)   $\mathcal{L}f_{z+1}(s) = -\frac{d}{ds}\mathcal{L}f_z(s),$

which in view of (1.18.29) is equivalent to the identity

(1.18.32)   $\Gamma(z + 1) = z\Gamma(z).$

Also comparison of (1.18.27) and (1.18.29), with $z = n + 1$, yields

(1.18.33)   $\Gamma(n + 1) = n!.$


Table 2. Further Laplace transforms

      $f(t)$                 $\mathcal{L}f(s)$
(e)   $t^{z-1}$              $\Gamma(z)s^{-z}$
(f)   $\log t$               $-(\gamma + \log s)/s$
(g)   $(\log t)t^{z-1}$      $(\Gamma'(z) - \Gamma(z)\log s)/s^z$
(h)   $t^{z-1}e^{at}$        $\Gamma(z)(s - a)^{-z}$


We can obtain another Laplace transform identity by applying $d/dz$ to (1.18.29),
noting that, since $s^{-z} = e^{-z\log s}$,

(1.18.34)   $\frac{d}{dz}s^{-z} = -(\log s)s^{-z}, \quad s > 0,$

with an analogous formula for $(d/dz)t^{z-1}$:

(1.18.35)   $\frac{d}{dz}t^{z-1} = (\log t)t^{z-1}.$

Hence (1.18.29) yields

(1.18.36)   $f(t) = (\log t)t^{z-1} \implies \mathcal{L}f(s) = (\Gamma'(z) - \Gamma(z)\log s)s^{-z}.$

In particular,

(1.18.37)   $f(t) = \log t \implies \mathcal{L}f(s) = (\Gamma'(1) - \log s)s^{-1} = -\frac{\gamma + \log s}{s},$

where $\gamma = -\Gamma'(1)$ is known as Euler's constant. Taking $s = 1$ in (1.18.37), we have
the formula

(1.18.38)   $\gamma = -\int_0^\infty (\log t)e^{-t}\,dt.$
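
Formula (1.18.38) lends itself to a quick numerical experiment. The sketch below substitutes $t = e^u$, which gives $\gamma = -\int_{-\infty}^\infty u\,e^{u - e^u}\,du$, and applies the trapezoid rule on a finite window. The function name, the window $[-40, 5]$, and the step count are illustrative assumptions; the integrand is negligible outside that window.

```python
import math

def euler_constant(lo=-40.0, hi=5.0, n=45000):
    """Approximate -integral_0^inf log(t) e^{-t} dt using t = e^u and the
    trapezoid rule in u on [lo, hi] (assumed truncation window)."""
    h = (hi - lo) / n
    total = 0.0
    for j in range(n + 1):
        u = lo + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * u * math.exp(u - math.exp(u))
    return -total * h

if __name__ == "__main__":
    print(euler_constant())  # should be close to 0.5772156649
```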


Collecting these results, we complement the table of Laplace transforms compiled 
in Table 1 with that in Table 2. Note that (h) follows from (e), via (1.18.23). One 
has similar variants of (f)—(g). 


Another function to consider is the impulse function

(1.18.39)   $\chi_I(t) = 1$ if $t \in I$, $\quad \chi_I(t) = 0$ if $t \notin I$,

where $I = [a, b]$ is an interval, with $0 \le a < b < \infty$. We have

(1.18.40)   $\mathcal{L}\chi_I(s) = \int_a^b e^{-st}\,dt = \frac{e^{-as} - e^{-bs}}{s}.$


Let us apply the Laplace transform method to the following initial value problem.
Take $k, a, a_0, a_1 \in \mathbb{R}$, and consider

(1.18.41)   $f''(t) + k^2f(t) = \cos at, \quad f(0) = a_0, \quad f'(0) = a_1.$

From (1.18.8),

(1.18.42)   $\mathcal{L}f''(s) = s^2\mathcal{L}f(s) - a_0s - a_1,$

and since $\mathcal{L}(\cos at)(s) = s/(s^2 + a^2)$, (1.18.11) becomes

(1.18.43)   $\mathcal{L}f(s) = \frac{s}{(s^2 + k^2)(s^2 + a^2)} + \frac{a_0s + a_1}{s^2 + k^2}.$

The last term on the right is the Laplace transform of

(1.18.44)   $a_0\cos kt + \frac{a_1}{k}\sin kt.$


It remains to write the first term on the right side of (1.18.43) as a Laplace transform.
For this, we apply the method of partial fractions. To start, we try

(1.18.45)   $\frac{s}{(s^2 + k^2)(s^2 + a^2)} = \frac{\alpha s + \beta}{s^2 + a^2} + \frac{\gamma s + \delta}{s^2 + k^2},$

with unknowns $\alpha, \beta, \gamma, \delta$. Multiplying through by $(s^2 + k^2)(s^2 + a^2)$ and equating
coefficients of various powers of $s$ leads to four linear equations in these four
unknowns. Two of them yield $\alpha = -\gamma$ and $\beta = -\delta$, and then the other two become

(1.18.46)   $(k^2 - a^2)\alpha = 1, \quad (k^2 - a^2)\beta = 0.$

If $k^2 \ne a^2$, these are uniquely solvable, for $\alpha = (k^2 - a^2)^{-1}$, $\beta = 0$, and (1.18.45)
becomes

(1.18.47)   $\frac{s}{(s^2 + k^2)(s^2 + a^2)} = \frac{1}{k^2 - a^2}\Bigl(\frac{s}{s^2 + a^2} - \frac{s}{s^2 + k^2}\Bigr).$

This is the Laplace transform of

(1.18.48)   $\varphi_{a,k}(t) = \frac{1}{k^2 - a^2}(\cos at - \cos kt).$

Then the solution to the differential equation (1.18.41) is

(1.18.49)   $f(t) = \varphi_{a,k}(t) + a_0\cos kt + \frac{a_1}{k}\sin kt.$

This approach fails for $k^2 = a^2$, paralleling the situation we encountered in
examining (1.11.4). One way to treat this exceptional case is to pass to the limit
in (1.18.48), obtaining

(1.18.50)   $\varphi_{k,k}(t) = \lim_{a \to k}\varphi_{a,k}(t) = \lim_{a \to k}\frac{1}{k + a}\cdot\frac{\cos at - \cos kt}{k - a} = \frac{t}{2k}\sin kt.$
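
The formulas (1.18.49) and (1.18.50) can be checked against the differential equation (1.18.41) directly. The following Python sketch evaluates the claimed solution and measures the residual $f'' + k^2f - \cos at$, with $f''$ approximated by a central difference. The function names, the step size `h`, and the tolerance used to detect the resonant case are assumptions made for this illustration; $k$ is assumed nonzero.

```python
import math

def solution(t, k, a, a0, a1):
    """f(t) from (1.18.49), switching to the resonant limit (1.18.50)
    when a^2 equals k^2 (up to an assumed tolerance)."""
    if abs(k * k - a * a) > 1e-12:
        phi = (math.cos(a * t) - math.cos(k * t)) / (k * k - a * a)
    else:
        phi = t * math.sin(k * t) / (2 * k)
    return phi + a0 * math.cos(k * t) + (a1 / k) * math.sin(k * t)

def residual(t, k, a, a0, a1, h=1e-4):
    """f''(t) + k^2 f(t) - cos(at), with f'' from a central difference."""
    f = lambda u: solution(u, k, a, a0, a1)
    second = (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)
    return second + k * k * f(t) - math.cos(a * t)

if __name__ == "__main__":
    print(residual(1.7, 2.0, 1.0, 1.0, 0.5))  # nonresonant case, near 0
    print(residual(1.7, 2.0, 2.0, 1.0, 0.5))  # resonant case k = a, near 0
```

The residuals are small but not zero, since the central difference itself carries an error of order $h^2$ plus rounding.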


Another approach is to refine the method of partial fractions. In lieu of (1.18.45),
we have

(1.18.51)   $\frac{s}{(s^2 + k^2)^2} = \frac{s}{(s + ik)^2(s - ik)^2} = \frac{i}{4k}\Bigl(\frac{1}{(s + ik)^2} - \frac{1}{(s - ik)^2}\Bigr).$

Using (1.18.26), with $n = 1$, we have

(1.18.52)   $\mathcal{L}(te^{at})(s) = \frac{1}{(s - a)^2}.$

Hence the right side of (1.18.51) is the Laplace transform of

(1.18.53)   $\frac{i}{4k}\bigl(te^{-ikt} - te^{ikt}\bigr) = \frac{t}{2k}\sin kt,$

and again we obtain the conclusion of (1.18.50), from a different perspective.


In light of this analysis, and recalling (1.18.12), we are motivated to compute
the inverse Laplace transform of functions of the form $q(s)/p(s)$, where $p(s)$ is a
polynomial of degree $n$, say

(1.18.54)   $p(s) = s^n + c_{n-1}s^{n-1} + \cdots + c_0,$

and $q(s)$ is a polynomial of degree $\le n - 1$. The polynomial $p(s)$ has complex roots
$r_1, \dots, r_m$, of multiplicity $k_1, \dots, k_m$, and we can write (1.18.54) as

(1.18.55)   $p(s) = (s - r_1)^{k_1}\cdots(s - r_m)^{k_m}, \quad k_1 + \cdots + k_m = n.$

This is a consequence of the fundamental theorem of algebra, which is proved in
Appendix 2.C. The following is an incisive result on the method of partial fractions.


Proposition 1.18.2. If $p(s)$ is a polynomial of the form (1.18.55), with $\{r_1, \dots, r_m\}$
distinct, and if $q(s)$ is a polynomial of degree $\le n - 1$, then there exist unique
$a_{j\ell} \in \mathbb{C}$, for $1 \le \ell \le m$, $1 \le j \le k_\ell$, such that

(1.18.56)   $\frac{q(s)}{p(s)} = \sum_{\ell=1}^m\sum_{j=1}^{k_\ell}\frac{a_{j\ell}}{(s - r_\ell)^j}.$


Proof. We use some concepts developed in Chapter 2. The set of collections $(a_{j\ell})$
of the form

$\{a_{j\ell} \in \mathbb{C} : 1 \le j \le k_\ell,\ 1 \le \ell \le m\}$

forms a vector space $V_0$, of dimension $k_1 + \cdots + k_m = n$. Meanwhile, the space
$\mathcal{P}_{n-1}$ of polynomials $q(s)$ of degree $\le n - 1$ is also a vector space of dimension $n$.
Now the correspondence in (1.18.56) yields a well defined linear map $T$ from $V_0$ to
$\mathcal{P}_{n-1}$, given by $T(a_{j\ell}) = q(s)$, the numerator in the left side of (1.18.56), and one
can verify that this map is one-to-one. Hence (cf. Corollary 2.3.7 of Chapter 2),
this map is also onto, and this gives Proposition 1.18.2.


Given the representation (1.18.56), we deduce from (1.18.26) that

(1.18.57)   $\mathcal{L}^{-1}\Bigl(\frac{q}{p}\Bigr)(t) = \sum_{\ell=1}^m\sum_{j=1}^{k_\ell}\frac{a_{j\ell}}{(j - 1)!}\,t^{j-1}e^{r_\ell t}.$

Taking $q(s) = 1$, we obtain a function $\varphi(t)$, of the form (1.18.57), such that

(1.18.58)   $\mathcal{L}\varphi(s) = \frac{1}{p(s)}.$
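Then the solution $f(t)$ to (1.18.1)-(1.18.2) is equal to $\mathcal{L}^{-1}(q/p)(t)$ plus $f_0(t)$,
satisfying

(1.18.59)   $\mathcal{L}f_0(s) = \frac{\mathcal{L}g(s)}{p(s)} = \mathcal{L}\varphi(s)\mathcal{L}g(s).$

The following result provides a useful integral formula for $f_0$.


Proposition 1.18.3. Let $\varphi$ and $g$ satisfy (1.18.3), and set

(1.18.60)   $\varphi * g(t) = \int_0^t \varphi(t - \tau)g(\tau)\,d\tau.$

Then, for $s > A$,

(1.18.61)   $\mathcal{L}(\varphi * g)(s) = \mathcal{L}\varphi(s)\mathcal{L}g(s).$

Proof. Given (1.18.60), we have

(1.18.62)
$\mathcal{L}(\varphi * g)(s) = \int_0^\infty e^{-st}\int_0^t \varphi(t - \tau)g(\tau)\,d\tau\,dt$
$= \int_0^\infty\int_0^t e^{-s(t-\tau)}e^{-s\tau}\varphi(t - \tau)g(\tau)\,d\tau\,dt$
$= \int_0^\infty\int_\tau^\infty e^{-s(t-\tau)}e^{-s\tau}\varphi(t - \tau)g(\tau)\,dt\,d\tau$
$= \mathcal{L}\varphi(s)\int_0^\infty e^{-s\tau}g(\tau)\,d\tau$
$= \mathcal{L}\varphi(s)\mathcal{L}g(s),$

as asserted.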
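
As an illustration of how (1.18.59)-(1.18.61) produce a solution, take $p(s) = s^2 + k^2$, so that $\varphi(t) = (\sin kt)/k$, and $g(t) = \cos at$. The convolution $\varphi * g$ should then reproduce $\varphi_{a,k}$ from (1.18.48). The Python sketch below checks this numerically. The helper name `convolve` and the use of composite Simpson's rule are assumptions made here for illustration.

```python
import math

def convolve(phi, g, t, n=2000):
    """(phi * g)(t) = integral_0^t phi(t - tau) g(tau) d tau, approximated
    by composite Simpson's rule with n subintervals (n must be even)."""
    if t == 0:
        return 0.0
    h = t / n
    total = phi(t) * g(0.0) + phi(0.0) * g(t)
    for j in range(1, n):
        tau = j * h
        total += (4 if j % 2 else 2) * phi(t - tau) * g(tau)
    return total * h / 3

if __name__ == "__main__":
    k, a, t = 2.0, 1.0, 1.5
    phi = lambda u: math.sin(k * u) / k   # Laplace transform 1/(s^2 + k^2)
    g = lambda u: math.cos(a * u)
    print(convolve(phi, g, t))
    print((math.cos(a * t) - math.cos(k * t)) / (k * k - a * a))
```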


Recall that the method of variation of parameters, discussed in §1.14, also 
leads to an integral formula involving an integral over [0,¢]. In fact, the method of 
variation of parameters and the use of the Laplace transform discussed here can both 
be understood as special cases of a general method, involving Duhamel’s formula, 
arising when the equations are recast as first-order systems. This is explained in 
§3.9 and §3.B. 


Exercises 


1. Compute the inverse Laplace transform of the following functions. 


1 
(@) a7 
s+1 
b Te ea ay 
) s3 + 3s? + 2s 
2. Use the Laplace transform to solve the following initial value problems.
(a) f(t) £3f"() + 2F(t) = e-tsint,  f(0) =0, #"(0) = 1, 
(6) f(t) -—f@ =sint, f(0) =0 for0<j <3. 


3. Show that 


Hint. By (1.18.5), 


d 
ase ss) = —L(tf)(s) = — 
S 8 
Integrate, and find the constant of integration using 


Jim Lf(s) = 0. 


4. Compute the Laplace transform of 
1—cost 
eo 


1.A. The genesis of Bessel’s equation: PDE in polar coordinates 


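Bessel functions, the subject of §1.16, arise in the natural generalization of the
equation

(1.A.1)   $\frac{d^2u}{dx^2} + k^2u = 0,$

with solutions $\sin kx$ and $\cos kx$, to partial differential equations

(1.A.2)   $\Delta u + k^2u = 0,$

where $\Delta$ is the Laplace operator, acting on a function $u$ on a domain $\Omega \subset \mathbb{R}^n$ by

(1.A.3)   $\Delta u = \frac{\partial^2u}{\partial x_1^2} + \cdots + \frac{\partial^2u}{\partial x_n^2}.$

We can eliminate $k^2$ from (1.A.2) by scaling. Set $u(x) = v(kx)$. Then equation
(1.A.2) becomes

(1.A.4)   $(\Delta + 1)v = 0.$

We specialize to the case $n = 2$ and write

(1.A.5)   $\Delta u = \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2}.$

For a number of special domains $\Omega \subset \mathbb{R}^2$, such as circular domains, annular domains,
angular sectors, and pie-shaped domains, it is convenient to switch to polar
coordinates $(r, \theta)$, related to $(x, y)$-coordinates by

(1.A.6)   $x = r\cos\theta, \quad y = r\sin\theta.$

In such coordinates,

(1.A.7)   $\Delta v = \Bigl(\frac{\partial^2}{\partial r^2} + \frac1r\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\Bigr)v.$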
A special class of solutions to (1.A.4) has the form

(1.A.8)   $v = w(r)e^{i\nu\theta}.$

By (1.A.7), for such $v$,

(1.A.9)   $(\Delta + 1)v = \Bigl[\frac{d^2w}{dr^2} + \frac1r\frac{dw}{dr} + \Bigl(1 - \frac{\nu^2}{r^2}\Bigr)w\Bigr]e^{i\nu\theta},$

so (1.A.4) holds if and only if

(1.A.10)   $\frac{d^2w}{dr^2} + \frac1r\frac{dw}{dr} + \Bigl(1 - \frac{\nu^2}{r^2}\Bigr)w = 0.$

This is Bessel's equation (1.16.1) (with different variables).


Note that if $v$ solves (1.A.4) on $\Omega \subset \mathbb{R}^2$ and if $\Omega$ is a circular domain or an
annular domain, centered at the origin, then $\nu$ must be an integer. However, if $\Omega$
is an angular sector or a pie-shaped domain, with vertex at the origin, $\nu$ need not
be an integer.


In $n$ dimensions, the Laplace operator (1.A.3) can be written

(1.A.11)   $\Delta v = \Bigl(\frac{\partial^2}{\partial r^2} + \frac{n - 1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Delta_S\Bigr)v,$

where $\Delta_S$ is a second order differential operator acting on functions on the unit
sphere $S^{n-1} \subset \mathbb{R}^n$, called the Laplace-Beltrami operator. Generalizing (1.A.8), one
looks for solutions to (1.A.4) of the form

(1.A.12)   $v(x) = w(r)\psi(\omega),$

where $x = r\omega$, $r \in (0, \infty)$, $\omega \in S^{n-1}$. Parallel to (1.A.9), for such $v$,

(1.A.13)   $(\Delta + 1)v = \Bigl[\frac{d^2w}{dr^2} + \frac{n - 1}{r}\frac{dw}{dr} + \Bigl(1 - \frac{\nu^2}{r^2}\Bigr)w\Bigr]\psi(\omega),$

provided

(1.A.14)   $\Delta_S\psi = -\nu^2\psi.$

The equation

(1.A.15)   $\frac{d^2w}{dr^2} + \frac{n - 1}{r}\frac{dw}{dr} + \Bigl(1 - \frac{\nu^2}{r^2}\Bigr)w = 0$

is a variant of Bessel's equation. If we set

(1.A.16)   $\varphi(r) = r^{n/2 - 1}w(r),$

then (1.A.15) is converted into the Bessel equation

(1.A.17)   $\frac{d^2\varphi}{dr^2} + \frac1r\frac{d\varphi}{dr} + \Bigl(1 - \frac{\mu^2}{r^2}\Bigr)\varphi = 0, \quad \mu^2 = \nu^2 + \Bigl(\frac{n - 2}{2}\Bigr)^2.$

The study of solutions to (1.A.14) gives rise to the study of spherical harmonics,
and from there to other special functions, such as Legendre functions.

The search for solutions of the form (1.A.12) is a key example of the method
of separation of variables for partial differential equations. It arises in numerous
other contexts. Here are a couple of other examples:

(1.A.18)   $(\Delta - |x|^2 + k^2)u = 0,$

and


(1.4.19) (A+ 5 +H)u=0 


The first describes the $n$-dimensional quantum harmonic oscillator. The second (for
$n = 3$) describes the quantum mechanical model of a hydrogen atom, according to
Schrödinger. Study of these equations leads to other special functions defined by
differential equations, such as Hermite functions and Whittaker functions.


Much further material on these topics can be found in books on partial differ- 
ential equations, such as [45] (particularly Chapters 3 and 8). 


1.B. Euler’s gamma function 


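We saw in (1.16.13) the need for a function $\Gamma(z)$ satisfying

(1.B.1)   $\Gamma(z + 1) = z\Gamma(z).$

Here we produce a function that has this property, namely

(1.B.2)   $\Gamma(z) = \int_0^\infty e^{-t}t^{z-1}\,dt$, for $z > 0$.

To check (1.B.1) for $z > 0$, we apply integration by parts.

(1.B.3)
$\Gamma(z + 1) = \int_0^\infty e^{-t}t^z\,dt$
$= -\int_0^\infty \Bigl(\frac{d}{dt}e^{-t}\Bigr)t^z\,dt$
$= z\Gamma(z),$

since $dt^z/dt = zt^{z-1}$.

The integral (1.B.2) is readily evaluated for $z = 1$, yielding

(1.B.4)   $\Gamma(1) = 1.$

Then repeated use of (1.B.3) gives

(1.B.5)   $\Gamma(k + 1) = k!$, for $k \in \mathbb{Z}^+$.

There is also a useful formula for $\Gamma(1/2)$, given by

(1.B.6)   $\Gamma\bigl(\tfrac12\bigr) = \int_0^\infty e^{-t}t^{-1/2}\,dt = 2\int_0^\infty e^{-s^2}\,ds = \sqrt{\pi},$

the last identity by (1.2.26). Then repeated use of (1.B.3) gives

(1.B.7)   $\Gamma\bigl(k + \tfrac12\bigr) = \bigl(k - \tfrac12\bigr)\bigl(k - \tfrac32\bigr)\cdots\tfrac12\,\Gamma\bigl(\tfrac12\bigr) = \pi^{1/2}\,2^{-2k}\,\frac{(2k)!}{k!}.$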
Having (1.B.1), we can extend $\Gamma(z)$ to be well defined and smooth on the larger
set $\mathbb{R} \setminus \{0, -1, -2, \dots\}$. To see this, rewrite (1.B.1) as

(1.B.8)   $\Gamma(z) = \frac1z\Gamma(z + 1).$

Having $\Gamma(z)$ defined and smooth on $z \in (0, \infty)$, by (1.B.2), we see that the right
side of (1.B.8) is defined and smooth for $z \in (-1, \infty)$, except for a pole at $z = 0$.
This extends $\Gamma(z)$ to $z \in (-1, \infty) \setminus \{0\}$. Then the right side of (1.B.8) is defined
and smooth for $z \in (-2, \infty)$, except for poles at $z = 0$ and $z = -1$. This argument
can be continued. Let us further note that, by (1.B.2),

(1.B.9)   $\Gamma(z) > 0$ for $z > 0$,

so $1/\Gamma(z)$ is defined and smooth for $z \in (0, \infty)$. Rewriting (1.B.8) as

(1.B.10)   $\frac{1}{\Gamma(z)} = \frac{z}{\Gamma(z + 1)},$

and arguing as above, we have $1/\Gamma(z)$ defined and smooth for all $z \in \mathbb{R}$, vanishing
precisely for $z \in \{0, -1, -2, \dots\}$.
We derive another identity that is useful for the treatment of Bessel functions
in §1.16, involving the beta function $B(x, y)$, defined for $x, y > 0$ by

(1.B.11)   $B(x, y) = \int_0^1 s^{x-1}(1 - s)^{y-1}\,ds = \int_0^\infty (1 + u)^{-x-y}u^{x-1}\,du,$

the latter identity via the change of variable $u = s/(1 - s)$. Our asserted identity is

(1.B.12)   $B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}.$

To prove this, note that since

(1.B.13)   $\Gamma(z)p^{-z} = \int_0^\infty e^{-pt}t^{z-1}\,dt,$

we have

(1.B.14)   $(1 + u)^{-x-y} = \frac{1}{\Gamma(x + y)}\int_0^\infty e^{-(1+u)t}t^{x+y-1}\,dt,$

so

(1.B.15)
$B(x, y) = \frac{1}{\Gamma(x + y)}\int_0^\infty\int_0^\infty e^{-(1+u)t}u^{x-1}t^{x+y-1}\,du\,dt$
$= \frac{\Gamma(x)}{\Gamma(x + y)}\int_0^\infty e^{-t}t^{y-1}\,dt$
$= \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)},$

as asserted.
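
The identities (1.B.1), (1.B.7), and (1.B.12) can be spot-checked with Python's standard `math.gamma`. In the sketch below the function names are hypothetical, and the midpoint rule for the beta integral is an assumed choice that works well only when the integrand has no endpoint singularities (roughly $x, y \ge 1$).

```python
import math

def beta_integral(x, y, n=20000):
    """Midpoint-rule approximation of
    B(x, y) = integral_0^1 s^(x-1) (1-s)^(y-1) ds."""
    h = 1.0 / n
    return h * sum(((j + 0.5) * h) ** (x - 1) * (1 - (j + 0.5) * h) ** (y - 1)
                   for j in range(n))

def beta_gamma(x, y):
    """Right side of (1.B.12)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def gamma_half_integer(k):
    """Right side of (1.B.7): sqrt(pi) * 2^(-2k) * (2k)! / k!."""
    return math.sqrt(math.pi) * math.factorial(2 * k) / (4 ** k * math.factorial(k))

if __name__ == "__main__":
    print(beta_integral(2.5, 3.0), beta_gamma(2.5, 3.0))
    print(math.gamma(3.5), gamma_half_integer(3))
    print(math.gamma(4.3), 3.3 * math.gamma(3.3))  # Gamma(z+1) = z Gamma(z)
```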
For closer contact with (1.16.38), note that setting $s = t^2$ in (1.B.11) gives

(1.B.16)   $B(x, y) = 2\int_0^1 t^{2x-1}(1 - t^2)^{y-1}\,dt,$

so, if $k \in \mathbb{Z}^+$ and $\nu > -1/2$,

(1.B.17)   $B\bigl(k + \tfrac12, \nu + \tfrac12\bigr) = \int_{-1}^1 t^{2k}(1 - t^2)^{\nu - 1/2}\,dt.$


There is much more that can be said about the gamma function, such as that
it extends to $\mathbb{C} \setminus \{0, -1, -2, \dots\}$, with $1/\Gamma(z)$ defined and smooth for all $z \in \mathbb{C}$
(which permits one to define $J_\nu(t)$ for complex $\nu$). We refer the reader to [28], §4.3
of [47], or Chapter 3, Appendix A of [45], for further material. We mention the
following identity, of use in (1.16.33), whose proof can be found in these references:

(1.B.18)   $\Gamma(\nu)\Gamma(1 - \nu) = \frac{\pi}{\sin\pi\nu}.$

Note that both sides are defined and smooth for $\nu \in \mathbb{R} \setminus \mathbb{Z}$, with singularities on $\mathbb{Z}$.


1.C. Differentiating power series 


Here we establish continuity and differentiability properties for a power series

(1.C.1)   $f(t) = \sum_{k=0}^\infty a_kt^k.$

We allow the coefficients $a_k$ to be complex numbers. To start, we assume this series
converges for some nonzero $t = t_0$. This implies that the terms in this series are
uniformly bounded for $t = t_0$,

(1.C.2)   $|a_kt_0^k| \le B < \infty, \quad \forall\, k.$

The following result establishes convergence for all smaller $|t|$.


Proposition 1.C.1. Given (1.C.2), the series (1.C.1) converges absolutely for
$|t| < T$, where $T = |t_0|$.


Proof. Pick $S \in (0, T)$, and assume $|t| \le S$. Then

(1.C.3)   $|a_kt^k| \le |a_kT^k|\Bigl(\frac{S}{T}\Bigr)^k \le Br^k,$

where $r = S/T \in (0, 1)$. Hence, for each $n \in \mathbb{N}$, if $|t| \le S$,

(1.C.4)   $\sum_{k=0}^n |a_kt^k| \le B\sum_{k=0}^n r^k.$

Now we can evaluate the geometric series on the right,

(1.C.5)
$S_n = \sum_{k=0}^n r^k \implies rS_n = \sum_{k=1}^{n+1} r^k$
$\implies (1 - r)S_n = 1 - r^{n+1}$
$\implies S_n = \frac{1 - r^{n+1}}{1 - r}.$

Consequently,

(1.C.6)
$0 < r < 1 \implies r^{n+1} \searrow 0$ as $n \to \infty$
$\implies S_n \nearrow \frac{1}{1 - r}$ as $n \to \infty$.
This establishes the asserted absolute convergence.


Similar arguments also lead to the following.


Proposition 1.C.2. In the setting of Proposition 1.C.1, if $0 < S < T$, the series
(1.C.1) converges uniformly on $|t| \le S$.


Proof. For each $n \in \mathbb{N}$, write

(1.C.7)   $f(t) = \sum_{k=0}^n a_kt^k + \sum_{k=n+1}^\infty a_kt^k = S_n(t) + R_n(t).$

The claim is that $S_n(t) \to f(t)$, uniformly on $|t| \le S$. Indeed, for $|t| \le S$,

(1.C.8)   $|R_n(t)| \le \sum_{k=n+1}^\infty |a_kt^k| \le B\sum_{k=n+1}^\infty r^k = \frac{Br^{n+1}}{1 - r},$

yielding $|R_n(t)| \to 0$ uniformly for $|t| \le S$.


Before continuing our study of the power series (1.C.1), we pause to note that
calculations above involving the geometric series (1.C.8) enable us to establish the
following result, known as the ratio test.


Proposition 1.C.3. Let $a_k \in \mathbb{C}$ and assume there exist $N < \infty$ and $r < 1$ such
that

(1.C.9)   $k \ge N \implies \Bigl|\frac{a_{k+1}}{a_k}\Bigr| \le r.$

Then the series $\sum_{k=0}^\infty a_k$ is absolutely convergent.


Proof. From (1.C.9) we have, by induction,

(1.C.10)   $|a_{N+\ell}| \le r^\ell|a_N|.$

Hence

(1.C.11)   $\sum_{\ell=0}^\infty |a_{N+\ell}| \le |a_N|\sum_{\ell=0}^\infty r^\ell = \frac{|a_N|}{1 - r}.$

This yields absolute convergence.


We now state the main result of this appendix. 
Proposition 1.C.4. If the power series (1.C.1) converges for $|t| < R$, then $f$ is
differentiable in $t \in (-R, R)$, and, for such $t$,

(1.C.12)   $f'(t) = \sum_{k=1}^\infty ka_kt^{k-1}.$


Proof. It suffices to show that (1.C.12) holds for $|t| < S$, for each $S < R$. Pick
$T \in (S, R)$, and note that the estimate (1.C.3) holds, when $|t| \le S$, with $r = S/T < 1$.
Hence, for $|t| \le S$,

(1.C.13)   $|ka_kt^{k-1}| \le \frac{k}{T}|a_kT^k|\Bigl(\frac{S}{T}\Bigr)^{k-1} \le \frac{B}{T}\,kr^{k-1}.$

Now the ratio test applies to $\sum_{k=1}^\infty kr^{k-1}$, given $r < 1$, so the series

(1.C.14)   $g(t) = \sum_{k=1}^\infty ka_kt^{k-1}$

is absolutely convergent, and also uniformly convergent, for $|t| \le S$. It remains to
show that $g(t) = f'(t)$ for $|t| < S$, or equivalently that

(1.C.15)   $\int_0^t g(s)\,ds = f(t) - f(0).$


This is a consequence of the following result. 


Proposition 1.C.5. Given $b_k \in \mathbb{C}$, assume

(1.C.16)   $g(t) = \sum_{k=0}^\infty b_kt^k$

is absolutely convergent, for $|t| < R$. Then, for $|t| < R$,

(1.C.17)   $\int_0^t g(s)\,ds = \sum_{k=0}^\infty \frac{b_k}{k + 1}t^{k+1}.$


Proof. It is elementary that the series on the right side of (1.C.17) converges for
$|t| < R$. Call the sum $F(t)$. As before, pick $S < T < R$. For $n \in \mathbb{N}$, write

(1.C.18)   $g(t) = \sum_{k=0}^n b_kt^k + \sum_{k=n+1}^\infty b_kt^k = g_n(t) + R_n(t).$

As in Proposition 1.C.2, we have $g_n(t) \to g(t)$ and $R_n(t) \to 0$, uniformly for $|t| \le S$,
especially

(1.C.19)   $\sup_{|t| \le S}|R_n(t)| \le \varepsilon_n \to 0.$

Clearly, for $|t| < R$,

(1.C.20)   $\int_0^t g_n(s)\,ds = \sum_{k=0}^n \frac{b_k}{k + 1}t^{k+1} \to F(t),$

as $n \to \infty$. Meanwhile,

(1.C.21)   $\Bigl|\int_0^t R_n(s)\,ds\Bigr| \le R\varepsilon_n.$

Taking $n \to \infty$ in (1.C.18)-(1.C.21) yields

(1.C.22)   $\int_0^t g(s)\,ds = F(t),$

as asserted. This proves Proposition 1.C.5, so we have Proposition 1.C.4.


Having (1.C.12), we can iterate, computing the derivative of $f'(t)$, as

(1.C.23)   $f''(t) = \sum_{k=2}^\infty k(k - 1)a_kt^{k-2},$

and so on,

(1.C.24)   $f^{(n)}(t) = \sum_{k=n}^\infty k(k - 1)\cdots(k - n + 1)a_kt^{k-n}.$

In particular,

(1.C.25)   $f^{(n)}(0) = n!\,a_n$, hence $a_n = \frac{f^{(n)}(0)}{n!}$.

We have the following.


Proposition 1.C.6. If $f(t)$ is given by a convergent power series (1.C.1) for $|t| < T$,
$T > 0$, then

(1.C.26)   $f(t) = \sum_{k=0}^\infty \frac{f^{(k)}(0)}{k!}t^k.$


Frequently, one can turn this around, take a function $f : (-T, T) \to \mathbb{R}$, compute
$f^{(k)}(0)$, and investigate whether (1.C.26) holds. Here is an important class of
functions for which this works. Take $r \in \mathbb{R}$, and set

(1.C.27)   $f(t) = (1 - t)^{-r}.$

We have

(1.C.28)   $f^{(n)}(t) = r(r + 1)\cdots(r + n - 1)(1 - t)^{-r-n},$

hence

(1.C.29)   $f^{(n)}(0) = r(r + 1)\cdots(r + n - 1).$


Claim. For $r \in \mathbb{R}$, we have

(1.C.30)   $(1 - t)^{-r} = \sum_{k=0}^\infty \frac{r(r + 1)\cdots(r + k - 1)}{k!}t^k$, for $|t| < 1$.
In other words, (1.C.1) holds with

(1.C.31)   $a_k = \frac{r(r + 1)\cdots(r + k - 1)}{k!}.$

Note that

(1.C.32)   $a_{k+1} = \frac{r + k}{k + 1}\,a_k,$

so the ratio test implies that the right side of (1.C.30) is absolutely convergent for
$|t| < 1$, i.e., we have a well defined continuous (and differentiable) function

(1.C.33)   $g(t) = \sum_{k=0}^\infty \frac{r(r + 1)\cdots(r + k - 1)}{k!}t^k.$

Our claim is therefore that

(1.C.34)   $g(t) = (1 - t)^{-r}.$

One approach to this is to estimate the remainder $R_n(t)$ in the expansion

(1.C.35)   $f(t) = \sum_{k=0}^n \frac{f^{(k)}(0)}{k!}t^k + R_n(t).$

A discussion of this appears in §4.3 of [50]. Here is another approach. We can
apply Proposition 1.C.4 to $g(t)$ to obtain

(1.C.36)   $(1 - t)g'(t) = rg(t),$

and then calculate

(1.C.37)
$\frac{d}{dt}\bigl[(1 - t)^rg(t)\bigr] = (1 - t)^rg'(t) - r(1 - t)^{r-1}g(t)$
$= (1 - t)^{r-1}\bigl\{(1 - t)g'(t) - rg(t)\bigr\}$
$= 0,$

and deduce (1.C.34), hence (1.C.30).

For an application of (1.C.30), with $r = 1/2$, see (1.6.60).
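
The claim (1.C.30) is easy to probe numerically. The sketch below sums the series using the recursion (1.C.32) for the coefficients and compares the result with $(1 - t)^{-r}$. The function name and the fixed number of terms are assumptions made for illustration; for $|t|$ close to 1 many more terms would be needed.

```python
def binomial_series(r, t, terms=200):
    """Partial sum of sum_k a_k t^k with a_0 = 1 and
    a_{k+1} = (r + k) / (k + 1) * a_k, as in (1.C.32)."""
    total, a_k, power = 0.0, 1.0, 1.0
    for k in range(terms):
        total += a_k * power
        a_k *= (r + k) / (k + 1)
        power *= t
    return total

if __name__ == "__main__":
    print(binomial_series(0.5, 0.3), (1 - 0.3) ** -0.5)
    print(binomial_series(-2.0, 0.5), (1 - 0.5) ** 2)   # series terminates: 1 - 2t + t^2
    print(binomial_series(3.0, -0.5), 1.5 ** -3)
```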


REMARK. Note the parallel between the use of (1.C.37) to prove (1.C.30) and the 
use of (1.1.10) to prove (1.1.13). 


Chapter 2 


Linear algebra 


The purpose of this chapter is to provide sufficient background in linear algebra for 
understanding the material of Chapter 3, on linear systems of differential equations. 
Results here will also be useful for the development of nonlinear systems in Chapter 
4, 


In §2.1 we define the class of vector spaces (real and complex) and discuss some
basic examples, including $\mathbb{R}^n$ and $\mathbb{C}^n$, or, as we denote them, $\mathbb{F}^n$, with $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$.
In §2.2 we consider linear transformations between such vector spaces. In particular
we look at an $m \times n$ matrix $A$ as defining a linear transformation $A : \mathbb{F}^n \to \mathbb{F}^m$. We
define the range $\mathcal{R}(T)$ and null space $\mathcal{N}(T)$ of a linear transformation $T : V \to W$.
In §2.3 we define the notion of basis of a vector space. Vector spaces with finite
bases are called finite dimensional. We establish the crucial property that any two
bases of such a vector space $V$ have the same number of elements (denoted $\dim V$).
We apply this to other results on bases of vector spaces, culminating in the
fundamental theorem of linear algebra, that if $T : V \to W$ is linear and $V$ is finite
dimensional, then $\dim\mathcal{N}(T) + \dim\mathcal{R}(T) = \dim V$, and discuss some of its important
consequences.


A linear transformation $T : V \to V$ is said to be invertible provided it is one-to-one
and onto, i.e., provided $\mathcal{N}(T) = 0$ and $\mathcal{R}(T) = V$. In §2.5 we define the
determinant of such $T$, $\det T$ (when $V$ is finite dimensional), and show that $T$ is
invertible if and only if $\det T \ne 0$. In §2.6 we study eigenvalues $\lambda_j$ and eigenvectors
$v_j$ of such a transformation, defined by $Tv_j = \lambda_jv_j$. Results of §2.5 imply $\lambda_j$ is a
root of the characteristic polynomial $\det(\lambda I - T)$. Section 2.7 extends the scope of
§2.6 to a treatment of generalized eigenvectors. This topic is connected to properties
of nilpotent matrices and triangular matrices, studied in §2.8.


In §2.9 we treat inner products on vector spaces, which endow them with a 
Euclidean geometry, in particular with a distance and a norm. In §2.10 we discuss 
two types of norms on linear transformations, the operator norm and the Hilbert- 
Schmidt norm. Then, in §§2.11-2.12, we discuss some special classes of linear
transformations on inner product spaces: self-adjoint, skew-adjoint, unitary, and
orthogonal transformations. 


Several appendices supplement the material of this chapter, with a treatment 
of the Jordan canonical form and Schur’s theorem on upper triangularization. This 
material is not needed for Chapter 3, but for the interested reader it provides a 
more complete introduction to linear algebra. (A great deal more on linear algebra 
can be found in [48].) The third appendix gives a proof of the fundamental theorem 
of algebra, that every nonconstant polynomial has complex roots. This result has 
several applications in §§2.6—2.7. 


2.1. Vector spaces 


The reader is most likely familiar with vectors in the plane $\mathbb{R}^2$ and 3-space $\mathbb{R}^3$. More
generally we have $n$-space $\mathbb{R}^n$, whose elements consist of $n$-tuples of real numbers:

(2.1.1)   $v = (v_1, \dots, v_n).$

There is vector addition; if also $w = (w_1, \dots, w_n) \in \mathbb{R}^n$,

(2.1.2)   $v + w = (v_1 + w_1, \dots, v_n + w_n).$

There is also multiplication by scalars; if $a$ is a real number (a scalar),

(2.1.3)   $av = (av_1, \dots, av_n).$

We could also use complex numbers, replacing $\mathbb{R}^n$ by $\mathbb{C}^n$, and allowing $a \in \mathbb{C}$ in
(2.1.3). We will use $\mathbb{F}$ to denote $\mathbb{R}$ or $\mathbb{C}$.


Many other vector spaces arise naturally. We define this general notion now. 
A vector space over F is a set V, endowed with two operations, that of vector 
addition and multiplication by scalars. That is, given $v, w \in V$ and $a \in \mathbb{F}$, then
$v + w$ and $av$ are defined in $V$. Furthermore, the following properties are to hold,
for all $u, v, w \in V$, $a, b \in \mathbb{F}$. First there are laws for vector addition:


(2.1.4)    Commutative law:  $u + v = v + u,$
(2.1.5)    Associative law:  $(u + v) + w = u + (v + w),$
(2.1.6)    Zero vector:  $\exists\, 0 \in V,\ v + 0 = v,$
(2.1.7)    Negative:  $\exists\, {-v},\ v + (-v) = 0.$


Next there are laws for multiplication by scalars: 


(2.1.8)    Associative law:  $a(bv) = (ab)v,$
(2.1.9)    Unit:  $1 \cdot v = v.$


Finally there are two distributive laws: 


(2.1.10)    $a(u + v) = au + av,$
(2.1.11)    $(a + b)u = au + bu.$


It is easy to see that $\mathbb{R}^n$ and $\mathbb{C}^n$ satisfy all these rules. We will present a number
of other examples below. Let us also note that a number of other simple identities
are automatic consequences of the rules given above. Here are some, which the
reader is invited to verify:

(2.1.12)
    $v + w = v \Rightarrow w = 0,$
    $v + 0 \cdot v = (1 + 0)v = v,$
    $0 \cdot v = 0,$
    $v + w = 0 \Rightarrow w = -v,$
    $v + (-1)v = 0 \cdot v = 0,$
    $(-1)v = -v.$


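Above we represented elements of $\mathbb{F}^n$ as row vectors. Often we represent elements
of $\mathbb{F}^n$ as column vectors. We write

(2.1.13)    $v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}, \quad av + w = \begin{pmatrix} av_1 + w_1 \\ \vdots \\ av_n + w_n \end{pmatrix}.$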


We give some other examples of vector spaces. Let $I = [a,b]$ denote an interval
in $\mathbb{R}$, and take a nonnegative integer $k$. Then $C^k(I)$ denotes the set of functions
$f : I \to \mathbb{F}$ whose derivatives up to order $k$ are continuous. We denote by $\mathcal{P}$ the set
of polynomials in $x$, with coefficients in $\mathbb{F}$. We denote by $\mathcal{P}_k$ the set of polynomials
in $x$ of degree $\le k$. In these various cases,

(2.1.14)    $(f + g)(x) = f(x) + g(x), \quad (af)(x) = a f(x).$


Such vector spaces and certain of their linear subspaces play a major role in the 
material developed in these notes. 


Regarding the notion just mentioned, we say a subset W of a vector space V 
is a linear subspace provided 


(2.1.15)    $w_j \in W,\ a_j \in \mathbb{F} \Longrightarrow a_1 w_1 + a_2 w_2 \in W.$


Then W inherits the structure of a vector space. 


Exercises 


1. Specify which of the following subsets of $\mathbb{R}^3$ are linear subspaces:
(a) $\{(x,y,z) : xy = 0\}$,
(b) $\{(x,y,z) : x + y = 0\}$,
(c) $\{(x,y,z) : x \ge 0,\ y = 0,\ z = 0\}$,
(d) $\{(x,y,z) : x \text{ is an integer}\}$,
(e) $\{(x,y,z) : x = 2z,\ y = -z\}$.


2. Show that the results in (2.1.12) follow from the basic rules (2.1.4)-(2.1.11).
Hint. To start, add $-v$ to both sides of the identity $v + w = v$, and take account
first of the associative law (2.1.5), and then of the rest of (2.1.4)-(2.1.7). For the
second line of (2.1.12), use the rules (2.1.9) and (2.1.11). Then use the first two
lines of (2.1.12) to justify the third line...


3. Demonstrate the following results for any vector space. Take $a \in \mathbb{F}$, $v \in V$.
    $a \cdot 0 = 0 \in V,$
    $a(-v) = -av.$


Hint. Feel free to use the results of (2.1.12). 


Let $V$ be a vector space (over $\mathbb{F}$) and $W, X \subset V$ linear subspaces. We say

(2.1.16)    $V = W + X$

provided each $v \in V$ can be written

(2.1.17)    $v = w + x, \quad w \in W,\ x \in X.$

We say

(2.1.18)    $V = W \oplus X$

provided each $v \in V$ has a unique representation (2.1.17).


4. Show that 


    $V = W \oplus X \iff V = W + X \text{ and } W \cap X = 0.$


5. Take $V = \mathbb{R}^3$. Specify in each case (a)-(c) whether $V = W + X$ and whether
$V = W \oplus X$.


6. If $W_1, \dots, W_m$ are linear subspaces of $V$, extend (2.1.16) to the notion

(2.1.19)    $V = W_1 + \cdots + W_m,$

and extend (2.1.18) to the notion that

(2.1.20)    $V = W_1 \oplus \cdots \oplus W_m.$


2.2. Linear transformations and matrices 


If $V$ and $W$ are vector spaces over $\mathbb{F}$ ($\mathbb{R}$ or $\mathbb{C}$), a map

(2.2.1)    $T : V \to W$

is said to be a linear transformation provided

(2.2.2)    $T(a_1 v_1 + a_2 v_2) = a_1 T v_1 + a_2 T v_2, \quad \forall\, a_j \in \mathbb{F},\ v_j \in V.$

We also write $T \in \mathcal{L}(V,W)$. In case $V = W$, we also use the notation $\mathcal{L}(V) = \mathcal{L}(V,V)$.

Linear transformations arise in a number of ways. For example, an $m \times n$
matrix $A$ with entries in $\mathbb{F}$ defines a linear transformation


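(2.2.3)    $A : \mathbb{F}^n \to \mathbb{F}^m$

by

(2.2.4)    $\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} \sum_\ell a_{1\ell} b_\ell \\ \vdots \\ \sum_\ell a_{m\ell} b_\ell \end{pmatrix}.$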


We also have linear transformations on function spaces, such as multiplication 
operators 


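(2.2.5)    $M_f : C^k(I) \to C^k(I), \quad M_f g(x) = f(x) g(x),$

given $f \in C^k(I)$, $I = [a,b]$, and the operation of differentiation,

(2.2.6)    $D : C^{k+1}(I) \to C^k(I), \quad Df(x) = f'(x).$

We also have integration,

(2.2.7)    $\mathcal{I} : C^k(I) \to C^{k+1}(I), \quad \mathcal{I}f(x) = \int_a^x f(y)\,dy.$

Note also that

(2.2.8)    $D : \mathcal{P}_{k+1} \to \mathcal{P}_k, \quad \mathcal{I} : \mathcal{P}_k \to \mathcal{P}_{k+1},$

where $\mathcal{P}_k$ denotes the space of polynomials in $x$ of degree $\le k$.

Two linear transformations $T_j \in \mathcal{L}(V,W)$ can be added,

(2.2.9)    $T_1 + T_2 : V \to W, \quad (T_1 + T_2)v = T_1 v + T_2 v.$

Also $T \in \mathcal{L}(V,W)$ can be multiplied by a scalar,

(2.2.10)    $aT : V \to W, \quad (aT)v = a(Tv).$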


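This makes $\mathcal{L}(V,W)$ a vector space.

We can also compose linear transformations $S \in \mathcal{L}(W,X)$, $T \in \mathcal{L}(V,W)$,

(2.2.11)    $ST : V \to X, \quad (ST)v = S(Tv).$

For example, we have

(2.2.12)    $M_f D : C^{k+1}(I) \to C^k(I), \quad M_f D g(x) = f(x) g'(x),$

given $f \in C^k(I)$. When two transformations

(2.2.13)    $A : \mathbb{F}^n \to \mathbb{F}^m, \quad B : \mathbb{F}^k \to \mathbb{F}^n$

are represented by matrices, e.g., $A$ as in (2.2.4) and

(2.2.14)    $B = \begin{pmatrix} b_{11} & \cdots & b_{1k} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nk} \end{pmatrix},$

then

(2.2.15)    $AB : \mathbb{F}^k \to \mathbb{F}^m$

is given by matrix multiplication,

(2.2.16)    $AB = \begin{pmatrix} \sum_\ell a_{1\ell} b_{\ell 1} & \cdots & \sum_\ell a_{1\ell} b_{\ell k} \\ \vdots & & \vdots \\ \sum_\ell a_{m\ell} b_{\ell 1} & \cdots & \sum_\ell a_{m\ell} b_{\ell k} \end{pmatrix}.$

For example,

(2.2.17)    $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.$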

Another way of writing (2.2.16) is to represent $A$ and $B$ as

(2.2.18)    $A = (a_{ij}), \quad B = (b_{ij}),$

and then we have

(2.2.19)    $AB = (d_{ij}), \quad d_{ij} = \sum_{\ell=1}^n a_{i\ell} b_{\ell j}.$
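As an illustration (not part of the original development), the formulas (2.2.4) and (2.2.19) can be sketched in a few lines of Python. Matrices are assumed to be stored as lists of rows, and the names mat_vec and mat_mul, as well as the sample matrices, are hypothetical choices made here.

```python
# Minimal sketch of (2.2.4) and (2.2.19); matrices are lists of rows.
# The function names are illustrative and do not come from the text.

def mat_vec(A, b):
    """Apply the m x n matrix A to the vector b in F^n, as in (2.2.4)."""
    return [sum(a_il * b_l for a_il, b_l in zip(row, b)) for row in A]

def mat_mul(A, B):
    """Return D = AB with d_ij = sum over l of a_il * b_lj, as in (2.2.19)."""
    n = len(B)  # rows of B must match columns of A
    assert all(len(row) == n for row in A)
    k = len(B[0])
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(k)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
v = [5, -2]

AB = mat_mul(A, B)
# Applying B and then A agrees with applying the product AB.
assert mat_vec(A, mat_vec(B, v)) == mat_vec(AB, v)
```

The final assertion checks, on one example only, the statement that the composition of the two transformations is given by the matrix product.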


To establish the identity (2.2.16), we note that it suffices to show the two sides have
the same effect on each $e_j \in \mathbb{F}^k$, $1 \le j \le k$, where $e_j$ is the column vector in $\mathbb{F}^k$
whose jth entry is 1 and whose other entries are 0. First note that

(2.2.20)    $B e_j = \begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix},$

the jth column in $B$, as one can see via (2.2.4). Similarly, if $D$ denotes the right
side of (2.2.16), $De_j$ is the jth column of this matrix, i.e.,

(2.2.21)    $D e_j = \begin{pmatrix} \sum_\ell a_{1\ell} b_{\ell j} \\ \vdots \\ \sum_\ell a_{m\ell} b_{\ell j} \end{pmatrix}.$

On the other hand, applying $A$ to (2.2.20), via (2.2.4), gives the same result, so
(2.2.16) holds.


Associated with a linear transformation as in (2.2.1) there are two special linear 
spaces, the null space of T and the range of T. The null space of T is 


(2.2.22)    $\mathcal{N}(T) = \{v \in V : Tv = 0\},$

and the range of $T$ is

(2.2.23)    $\mathcal{R}(T) = \{Tv : v \in V\}.$


Note that N(T) is a linear subspace of V and R(T) is a linear subspace of W. If 
N(T) = 0 we say T is injective; if R(T) = W we say T is surjective. Note that T 
is injective if and only if T is one-to-one, i.e., 


(2.2.24)    $Tv_1 = Tv_2 \Longrightarrow v_1 = v_2.$

If $T$ is surjective, we also say $T$ is onto. If $T$ is one-to-one and onto, we say it is an
isomorphism. In such a case the inverse

(2.2.25)    $T^{-1} : W \to V$

is well defined, and it is a linear transformation. We also say $T$ is invertible, in such
a case.


Exercises 


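1. With $D$ and $\mathcal{I}$ given by (2.2.6)-(2.2.7), compute $D\mathcal{I}$ and $\mathcal{I}D$.
2. In the context of Exercise 1, specify $\mathcal{N}(D)$, $\mathcal{N}(\mathcal{I})$, $\mathcal{R}(D)$, and $\mathcal{R}(\mathcal{I})$.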


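3. Consider $A, B : \mathbb{R}^3 \to \mathbb{R}^3$, given by

    $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$

Compute $AB$ and $BA$.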


4. In the context of Exercise 3, specify 


N(A), N(B), R(A), R(B). 


5. We say two $n \times n$ matrices $A$ and $B$ commute provided $AB = BA$. Note that
$AB \neq BA$ in Exercise 3. Pick out the pair of commuting matrices from this list:


Go) G 4). G 7). 


6. Show that (2.2.4) is a special case of matrix multiplication, as defined by the 
right side of (2.2.16). 


7. Show, without using the formula (2.2.16) identifying compositions of linear trans- 
formations and matrix multiplication, that matrix multiplication is associative, i.e., 


(2.2.26) A(BC) = (AB)C, 


where $C : \mathbb{F}^\ell \to \mathbb{F}^k$ is given by a $k \times \ell$ matrix and the products in (2.2.26) are
defined as matrix products, as in (2.2.19).


8. Show that the asserted identity (2.2.16) identifying compositions of linear trans- 
formations with matrix products follows from the result of Exercise 7. 
Hint. Formula (2.2.4), defining the action of $A$ on $\mathbb{F}^n$, is a matrix product.


9. Let $A : \mathbb{F}^n \to \mathbb{F}^m$ be defined by an $m \times n$ matrix, as in (2.2.3)-(2.2.4).

(a) Show that $\mathcal{R}(A)$ is the span of the columns of $A$.

Hint. See (2.2.20).

(b) Show that $\mathcal{N}(A) = 0$ if and only if the columns of $A$ are linearly independent.


10. Define the transpose of an $m \times n$ matrix $A = (a_{jk})$ to be the $n \times m$ matrix
$A^t = (a_{kj})$. Thus, if $A$ is as in (2.2.3)-(2.2.4),

(2.2.27)    $A^t = \begin{pmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{mn} \end{pmatrix}.$

For example,

    $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \Longrightarrow A^t = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}.$

Suppose also $B$ is an $n \times k$ matrix, as in (2.2.14), so $AB$ is defined, as in (2.2.15).
Show that

(2.2.28)    $(AB)^t = B^t A^t.$


11. Let 


Compute $AB$ and $BA$. Then compute $A^t B^t$ and $B^t A^t$.


2.3. Basis and dimension 


Given a finite set $S = \{v_1, \dots, v_k\}$ in a vector space $V$, the span of $S$ is the set of
vectors in $V$ of the form

(2.3.1)    $c_1 v_1 + \cdots + c_k v_k,$

with $c_j$ arbitrary scalars, ranging over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. This set, denoted $\operatorname{Span}(S)$, is a
linear subspace of $V$. The set $S$ is said to be linearly dependent if and only if there
exist scalars $c_1, \dots, c_k$, not all zero, such that (2.3.1) vanishes. Otherwise we say $S$
is linearly independent.

If $\{v_1, \dots, v_k\}$ is linearly independent, we say $S$ is a basis of $\operatorname{Span}(S)$, and that
$k$ is the dimension of $\operatorname{Span}(S)$. In particular, if this holds and $\operatorname{Span}(S) = V$, we
say $k = \dim V$. We also say $V$ has a finite basis, and that $V$ is finite dimensional.

By convention, if V has only one element, the zero element, we say V = 0 and 
dim V = 0. 

It is easy to see that any finite set $S = \{v_1, \dots, v_k\} \subset V$ has a maximal subset
that is linearly independent, and such a subset has the same span as $S$, so $\operatorname{Span}(S)$
has a basis. To take a complementary perspective, $S$ will have a minimal subset $S_0$
with the same span, and any such minimal subset will be a basis of $\operatorname{Span}(S)$. Soon
we will show that any two bases of a finite dimensional vector space $V$ have the
same number of elements (so $\dim V$ is well defined). First, let us relate $V$ to $\mathbb{F}^k$.


So say $V$ has a basis $S = \{v_1, \dots, v_k\}$. We define a linear transformation

(2.3.2)    $\mathcal{J}_S : \mathbb{F}^k \to V$

by

(2.3.3)    $\mathcal{J}_S(c_1 e_1 + \cdots + c_k e_k) = c_1 v_1 + \cdots + c_k v_k,$

where

(2.3.4)    $e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \ \dots, \ e_k = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.$

We say $\{e_1, \dots, e_k\}$ is the standard basis of $\mathbb{F}^k$. The linear independence of $S$ is
equivalent to the injectivity of $\mathcal{J}_S$ and the statement that $S$ spans $V$ is equivalent
to the surjectivity of $\mathcal{J}_S$. Hence the statement that $S$ is a basis of $V$ is equivalent
to the statement that $\mathcal{J}_S$ is an isomorphism, with inverse uniquely specified by

(2.3.5)    $\mathcal{J}_S^{-1}(c_1 v_1 + \cdots + c_k v_k) = c_1 e_1 + \cdots + c_k e_k.$


We begin our demonstration that dimV is well defined, with the following 
concrete result. 


Lemma 2.3.1. If $v_1, \dots, v_{k+1}$ are vectors in $\mathbb{F}^k$, then they are linearly dependent.


Proof. We use induction on $k$. The result is obvious if $k = 1$. We can suppose the
last component of some $v_j$ is nonzero, since otherwise we can regard these vectors
as elements of $\mathbb{F}^{k-1}$ and use the inductive hypothesis. Reordering these vectors, we
can assume the last component of $v_{k+1}$ is nonzero, and it can be assumed to be 1.
Form

    $w_j = v_j - v_{kj} v_{k+1}, \quad 1 \le j \le k,$

where $v_j = (v_{1j}, \dots, v_{kj})^t$. Then the last component of each of the vectors $w_1, \dots, w_k$
is 0, so we can regard these as $k$ vectors in $\mathbb{F}^{k-1}$. By induction, there exist scalars
$a_1, \dots, a_k$, not all zero, such that

    $a_1 w_1 + \cdots + a_k w_k = 0,$

so we have

    $a_1 v_1 + \cdots + a_k v_k = (a_1 v_{k1} + \cdots + a_k v_{kk}) v_{k+1},$

the desired linear dependence relation on $\{v_1, \dots, v_{k+1}\}$.


With this result in hand, we proceed. 


Proposition 2.3.2. If $V$ has a basis $\{v_1, \dots, v_k\}$ with $k$ elements and $\{w_1, \dots, w_\ell\} \subset V$
is linearly independent, then $\ell \le k$.


Proof. Take the isomorphism $\mathcal{J}_S : \mathbb{F}^k \to V$ described in (2.3.2)-(2.3.3). The hypotheses
imply that $\{\mathcal{J}_S^{-1} w_1, \dots, \mathcal{J}_S^{-1} w_\ell\}$ is linearly independent in $\mathbb{F}^k$, so Lemma
2.3.1 implies $\ell \le k$.


Corollary 2.3.3. If $V$ is finite-dimensional, any two bases of $V$ have the same
number of elements. If $V$ is isomorphic to $W$, these spaces have the same dimension.

Proof. If $S$ (with $\#S$ elements) and $T$ are bases of $V$, we have $\#S \le \#T$ and
$\#T \le \#S$, hence $\#S = \#T$. For the latter part, an isomorphism of $V$ onto $W$
takes a basis of $V$ to a basis of $W$.


The following is an easy but useful consequence. 


Proposition 2.3.4. If $V$ is finite dimensional and $W \subset V$ is a linear subspace,
then $W$ has a finite basis, and $\dim W \le \dim V$.


Proof. Suppose $\{w_1, \dots, w_\ell\}$ is a linearly independent subset of $W$. Proposition
2.3.2 implies $\ell \le \dim V$. If this set spans $W$, we are done. If not, there is an element
$w_{\ell+1} \in W$ not in this span, and $\{w_1, \dots, w_{\ell+1}\}$ is a linearly independent subset of
$W$. Again $\ell + 1 \le \dim V$. Continuing this process a finite number of times must
produce a basis of $W$.


A similar argument establishes: 


Proposition 2.3.5. Suppose $V$ is finite dimensional, $W \subset V$ a linear subspace, and
$\{w_1, \dots, w_\ell\}$ a basis of $W$. Then $V$ has a basis of the form $\{w_1, \dots, w_\ell, u_1, \dots, u_m\}$,
and $\ell + m = \dim V$.


Having this, we can establish the following result, sometimes called the funda- 
mental theorem of linear algebra (and sometimes the rank-nullity theorem). 


Proposition 2.3.6. Assume V and W are vector spaces, V finite dimensional, 
and 


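(2.3.6)    $A : V \to W$

a linear map. Then

(2.3.7)    $\dim \mathcal{N}(A) + \dim \mathcal{R}(A) = \dim V.$


Proof. Let $\{w_1, \dots, w_\ell\}$ be a basis of $\mathcal{N}(A) \subset V$, and complete it to a basis

    $\{w_1, \dots, w_\ell, u_1, \dots, u_m\}$

of $V$. Set $L = \operatorname{Span}\{u_1, \dots, u_m\}$, and consider

(2.3.8)    $A_0 : L \to W, \quad A_0 = A|_L.$

Clearly $w \in \mathcal{R}(A) \Rightarrow w = A(a_1 w_1 + \cdots + a_\ell w_\ell + b_1 u_1 + \cdots + b_m u_m) = A_0(b_1 u_1 + \cdots + b_m u_m)$, so

(2.3.9)    $\mathcal{R}(A_0) = \mathcal{R}(A).$

Furthermore,

(2.3.10)    $\mathcal{N}(A_0) = \mathcal{N}(A) \cap L = 0.$

Hence $A_0 : L \to \mathcal{R}(A_0)$ is an isomorphism. Thus $\dim \mathcal{R}(A) = \dim \mathcal{R}(A_0) = \dim L = m$,
and since $\dim \mathcal{N}(A) = \ell$ and $\ell + m = \dim V$, we have (2.3.7).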
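The identity (2.3.7) can be checked on a concrete matrix. The following is a minimal sketch, assuming exact rational arithmetic via Python's fractions module; it row-reduces a matrix, takes the number of pivot columns as the dimension of the range, and builds one null vector for each non-pivot column. The function names and the sample matrix are illustrative assumptions, and row reduction itself is a standard technique that the text has not introduced at this point.

```python
from fractions import Fraction

def rref(A):
    """Row-reduce A; return the reduced matrix and the list of pivot columns."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n):
        if r == m:
            break
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        R[r], R[p] = R[p], R[r]
        pivot = R[r][c]
        R[r] = [x / pivot for x in R[r]]
        for i in range(m):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                R[i] = [x - factor * y for x, y in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

def null_space_basis(A):
    """One null vector per non-pivot column, read off from the reduced matrix."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, p in enumerate(pivots):
            v[p] = -R[row][free]
        basis.append(v)
    return basis

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]   # sample 3 x 3 matrix, so dim V = 3
_, pivots = rref(A)
rank = len(pivots)                       # dimension of the range of A
kernel = null_space_basis(A)             # basis of the null space of A
assert all(mat_vec(A, v) == [0, 0, 0] for v in kernel)
assert len(kernel) + rank == len(A[0])   # the identity (2.3.7)
```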


The following is a significant special case. 
Corollary 2.3.7. Let $V$ be finite dimensional, and let $A : V \to V$ be linear. Then

(2.3.11)    $A$ injective $\iff$ $A$ surjective $\iff$ $A$ isomorphism.

We mention that these equivalences can fail for infinite dimensional spaces.
For example, if $\mathcal{P}$ denotes the space of polynomials in $x$, then $M_x : \mathcal{P} \to \mathcal{P}$ (where
$M_x f(x) = x f(x)$) is injective but not surjective, while $D : \mathcal{P} \to \mathcal{P}$ ($Df(x) = f'(x)$)
is surjective but not injective.
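The two operators in this example are easy to experiment with. In the sketch below, a polynomial is stored as its list of coefficients $[c_0, c_1, c_2, \dots]$; this representation and the function names are assumptions made for illustration. The checks exhibit, on sample inputs, the behavior described above; they are not a proof.

```python
from fractions import Fraction

def mult_by_x(p):
    """M_x: p(x) -> x p(x); every coefficient moves up one degree."""
    return [0] + list(p)

def deriv(p):
    """D: p(x) -> p'(x)."""
    return [j * c for j, c in enumerate(p)][1:] or [0]

def antideriv(p):
    """A polynomial q with q' = p and q(0) = 0."""
    return [0] + [Fraction(c, j + 1) for j, c in enumerate(p)]

p = [1, 2, 3]  # 1 + 2x + 3x^2

# x p(x) never has a constant term, so the polynomial 1 is not in the range of M_x.
assert mult_by_x(p)[0] == 0
# D sends distinct constants to the same polynomial 0, so D is not injective.
assert deriv([5]) == deriv([0]) == [0]
# Every p is the derivative of some polynomial, so D is surjective.
assert deriv(antideriv(p)) == p
```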

Next we have the following important characterization of injectivity and sur- 
jectivity. 


Proposition 2.3.8. Assume $V$ and $W$ are finite dimensional and $A : V \to W$ is
linear. Then

(2.3.12)    $A$ surjective $\iff AB = I_W$, for some $B \in \mathcal{L}(W,V)$,

and

(2.3.13)    $A$ injective $\iff CA = I_V$, for some $C \in \mathcal{L}(W,V)$.


Proof. Clearly $AB = I \Rightarrow A$ surjective and $CA = I \Rightarrow A$ injective. We establish
the converses.

First assume $A : V \to W$ is surjective. Let $\{w_1, \dots, w_\ell\}$ be a basis of $W$. Pick
$v_j \in V$ such that $Av_j = w_j$. Set

(2.3.14)    $B(a_1 w_1 + \cdots + a_\ell w_\ell) = a_1 v_1 + \cdots + a_\ell v_\ell.$

This works in (2.3.12).

Next assume $A : V \to W$ is injective. Let $\{v_1, \dots, v_k\}$ be a basis of $V$. Set
$w_j = Av_j$. Then $\{w_1, \dots, w_k\}$ is linearly independent, hence a basis of $\mathcal{R}(A)$, and
we then can produce a basis $\{w_1, \dots, w_k, u_1, \dots, u_m\}$ of $W$. Set

(2.3.15)    $C(a_1 w_1 + \cdots + a_k w_k + b_1 u_1 + \cdots + b_m u_m) = a_1 v_1 + \cdots + a_k v_k.$

This works in (2.3.13).


An $m \times n$ matrix $A$ defines a linear transformation $A : \mathbb{F}^n \to \mathbb{F}^m$, as in (2.2.3)-(2.2.4).
The columns of $A$ are

(2.3.16)    $a_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}.$

As seen in §2.2,

(2.3.17)    $Ae_j = a_j,$

where $e_1, \dots, e_n$ is the standard basis of $\mathbb{F}^n$. Hence

(2.3.18)    $\mathcal{R}(A) =$ linear span of the columns of $A$,

so

(2.3.19)    $\mathcal{R}(A) = \mathbb{F}^m \iff a_1, \dots, a_n$ span $\mathbb{F}^m$.

Furthermore,

(2.3.20)    $A\Bigl(\sum_{j=1}^n c_j e_j\Bigr) = 0 \iff \sum_{j=1}^n c_j a_j = 0,$

so

(2.3.21)    $\mathcal{N}(A) = 0 \iff \{a_1, \dots, a_n\}$ is linearly independent.

We have the following conclusion, in case $m = n$.

Proposition 2.3.9. Let $A$ be an $n \times n$ matrix, defining $A : \mathbb{F}^n \to \mathbb{F}^n$. Then the
following are equivalent:

(2.3.22)
    $A$ is invertible,
    the columns of $A$ are linearly independent,
    the columns of $A$ span $\mathbb{F}^n$.


Exercises 


1. Suppose $\{v_1, \dots, v_k\}$ is a basis of $V$. Show that

    $w_1 = v_1, \ w_2 = v_1 + v_2, \ \dots, \ w_j = v_1 + \cdots + v_j, \ \dots, \ w_k = v_1 + \cdots + v_k$


is also a basis of V. 


2. Let $V$ be the space of polynomials in $x$ and $y$ of degree $\le 10$. Specify a basis of
$V$ and compute $\dim V$.


3. Let $V$ be the space of polynomials in $x$ of degree $\le 5$, satisfying $p(-1) = p(0) = p(1) = 0$.
Find a basis of $V$ and give its dimension.


4. Assume the existence and uniqueness result stated at the beginning of §1.17
in Chapter 1. Let $a_j$ be continuous functions on an interval $I$, with $a_n$ nowhere
vanishing. Show that the space of functions $x \in C^{(n)}(I)$ solving

    $a_n(t) x^{(n)}(t) + \cdots + a_1(t) x'(t) + a_0(t) x(t) = 0$


is a vector space of dimension n. 


5. Denote the space of m x n matrices with entries in F (as in (2.2.4)) by 


(2.3.23) M(m x n,F). 
If m =n, denote it by 

(2.3.24) M(n,F). 
Show that 


    $\dim M(m \times n, \mathbb{F}) = mn,$

especially

    $\dim M(n, \mathbb{F}) = n^2.$


6. If $V$ and $W$ are finite dimensional vector spaces, $n = \dim V$, $m = \dim W$, what
is $\dim \mathcal{L}(V,W)$?


Let $V$ be a finite dimensional vector space, with linear subspaces $W$ and $X$. Recall
the conditions under which $V = W + X$ or $V = W \oplus X$, from §2.1. Let $\{w_1, \dots, w_k\}$
be a basis of $W$ and let $\{x_1, \dots, x_\ell\}$ be a basis of $X$.


7. Show that

    $V = W + X \iff \{w_1, \dots, w_k, x_1, \dots, x_\ell\}$ spans $V$,
    $V = W \oplus X \iff \{w_1, \dots, w_k, x_1, \dots, x_\ell\}$ is a basis of $V$.

8. Show that

    $V = W + X \Longrightarrow \dim W + \dim X \ge \dim V,$
    $V = W \oplus X \iff W \cap X = 0$ and $\dim W + \dim X = \dim V$.


9. Produce variants of Exercises 7-8 involving $V = W_1 + \cdots + W_m$ and
$V = W_1 \oplus \cdots \oplus W_m$, as in (2.1.19)-(2.1.20).


2.4. Matrix representation of a linear transformation 


We show how a linear transformation

(2.4.1)    $T : V \to W$

has a representation as an $m \times n$ matrix, with respect to a basis $S = \{v_1, \dots, v_n\}$
of $V$ and a basis $\Sigma = \{w_1, \dots, w_m\}$ of $W$. Namely, define $a_{ij}$ by

(2.4.2)    $Tv_j = \sum_{i=1}^m a_{ij} w_i, \quad 1 \le j \le n.$

The matrix representation of $T$ with respect to these bases is then

(2.4.3)    $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}.$

Note that the jth column of $A$ consists of the coefficients of $Tv_j$, when this is
written as a linear combination of $w_1, \dots, w_m$. Compare (2.2.20).

If we want to record the dependence on the bases $S$ and $\Sigma$, we can write

(2.4.4)    $A = \mathcal{M}_S^\Sigma(T) = \mathcal{J}_\Sigma^{-1} T \mathcal{J}_S : \mathbb{F}^n \to \mathbb{F}^m,$

given the isomorphism $\mathcal{J}_S : \mathbb{F}^n \to V$ as in (2.3.2)-(2.3.3) (with $n$ instead of $k$) and
its counterpart $\mathcal{J}_\Sigma : \mathbb{F}^m \to W$, and with the identification of $A$ with a matrix as in
(2.2.3)-(2.2.4).


The definition of matrix multiplication is set up precisely so that, if $X$ is a vector
space with basis $\Gamma = \{x_1, \dots, x_k\}$ and $U : X \to V$ is linear, then $TU : X \to W$ has
matrix representation

(2.4.5)    $\mathcal{M}_\Gamma^\Sigma(TU) = AB, \quad B = \mathcal{M}_\Gamma^S(U).$

Indeed, if we complement (2.4.4) with

(2.4.6)    $B = \mathcal{J}_S^{-1} U \mathcal{J}_\Gamma = \mathcal{M}_\Gamma^S(U),$

we have

(2.4.7)    $AB = \mathcal{J}_\Sigma^{-1}(TU)\mathcal{J}_\Gamma.$

As for the representation of $AB$ as a matrix product, see the discussion around
(2.2.15)-(2.2.21).


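For example, if

(2.4.8)    $T : V \to V,$

and we use the basis $S$ of $V$ as above, we have an $n \times n$ matrix $\mathcal{M}_S^S(T)$. If we pick
another basis $\tilde{S} = \{\tilde{v}_1, \dots, \tilde{v}_n\}$ of $V$, it follows from (2.4.5) that

(2.4.9)    $\mathcal{M}_{\tilde{S}}^{\tilde{S}}(T) = \mathcal{M}_S^{\tilde{S}}(I)\, \mathcal{M}_S^S(T)\, \mathcal{M}_{\tilde{S}}^S(I).$

Here

(2.4.10)    $\mathcal{M}_{\tilde{S}}^S(I) = \mathcal{J}_S^{-1} \mathcal{J}_{\tilde{S}} = C = (c_{jk}),$

where

(2.4.11)    $\tilde{v}_k = \sum_{j=1}^n c_{jk} v_j, \quad 1 \le k \le n,$

and we see (via (2.4.5)) that

(2.4.12)    $\mathcal{M}_S^{\tilde{S}}(I) = C^{-1}.$

To rewrite (2.4.9), we can say that if $A$ is the matrix representation of $T$ with
respect to the basis $S$ and $\tilde{A}$ the matrix representation of $T$ with respect to the
basis $\tilde{S}$, then

(2.4.13)    $\tilde{A} = C^{-1} A C.$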


REMARK. We say that $n \times n$ matrices $A$ and $\tilde{A}$, related as in (2.4.13), are similar.


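EXAMPLE. Consider the linear transformation

(2.4.14)    $D : \mathcal{P}_2 \to \mathcal{P}_2, \quad Df(x) = f'(x).$

With respect to the basis

(2.4.15)    $v_1 = 1, \quad v_2 = x, \quad v_3 = x^2,$

$D$ has the matrix representation

(2.4.16)    $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix},$

since $Dv_1 = 0$, $Dv_2 = v_1$, and $Dv_3 = 2v_2$. With respect to the basis

(2.4.17)    $\tilde{v}_1 = 1, \quad \tilde{v}_2 = 1 + x, \quad \tilde{v}_3 = 1 + x + x^2,$

$D$ has the matrix representation

(2.4.18)    $\tilde{A} = \begin{pmatrix} 0 & 1 & -1 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix},$

since $D\tilde{v}_1 = 0$, $D\tilde{v}_2 = \tilde{v}_1$, and $D\tilde{v}_3 = 1 + 2x = 2\tilde{v}_2 - \tilde{v}_1$. The reader is invited to
verify (2.4.13) for this example.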
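For readers who want a numerical cross-check after doing the verification by hand, here is a short sketch of the computation $C^{-1}AC$ for this example. The matrix $C$ is read off from (2.4.11) and (2.4.17); its inverse is entered by hand and checked, and the helper name mat_mul is a choice made here.

```python
def mat_mul(A, B):
    """Matrix product of lists of rows, as in (2.2.19)."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

# Matrix of D with respect to {1, x, x^2}, from (2.4.16).
A = [[0, 1, 0], [0, 0, 2], [0, 0, 0]]
# The kth column of C holds the coefficients of the kth new basis vector
# (1, 1 + x, 1 + x + x^2) with respect to the old basis, as in (2.4.11).
C = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
# Inverse of C, written out by hand and verified on the next lines.
C_inv = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert mat_mul(C, C_inv) == identity

# The similarity relation (2.4.13).
A_tilde = mat_mul(C_inv, mat_mul(A, C))
```

The resulting A_tilde can be compared entry by entry with the matrix in (2.4.18).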


Exercises 


1. Consider $T : \mathcal{P}_2 \to \mathcal{P}_2$, given by $Tp(x) = x^{-1} \int_0^x p(y)\,dy$. Compute the matrix
representation $B$ of $T$ with respect to the basis (2.4.15). Compute $AB$ and $BA$,
with $A$ given by (2.4.16).


2. In the setting of Exercise 1, compute $DT$ and $TD$ on $\mathcal{P}_2$ and compare their
matrix representations, with respect to the basis (2.4.15), with $AB$ and $BA$.


3. In the setting of Exercise 1, take $a \in \mathbb{R}$ and define

(2.4.19)    $T_a p(x) = \frac{1}{x - a} \int_a^x p(y)\,dy, \quad T_a : \mathcal{P}_2 \to \mathcal{P}_2.$

Compute the matrix representation of $T_a$ with respect to the basis (2.4.15).


4. Compute the matrix representation of $T_a$, given by (2.4.19), with respect to the
basis of $\mathcal{P}_2$ given in (2.4.17).


5. Let $A : \mathbb{C}^2 \to \mathbb{C}^2$ be given by


ete 


(with respect to the standard basis). Find a basis of $\mathbb{C}^2$ with respect to which the
matrix representation of $A$ is

Pe 0 1 

Aan) 


6. Let $V = \{a \cos t + b \sin t : a, b \in \mathbb{C}\}$, and consider

    $D = \frac{d}{dt} : V \to V.$

Compute the matrix representation of $D$ with respect to the basis $\{\cos t, \sin t\}$.


7. In the setting of Exercise 6, compute the matrix representation of $D$ with respect
to the basis $\{e^{it}, e^{-it}\}$.


2.5. Determinants and invertibility


Determinants arise in the study of inverting a matrix. To take the $2 \times 2$ case, solving
for $x$ and $y$ the system

(2.5.1)
    $ax + by = u,$
    $cx + dy = v$

can be accomplished by multiplying these equations by $d$ and $b$, respectively, and
subtracting, and by multiplying them by $c$ and $a$, respectively, and subtracting,
yielding

(2.5.2)
    $(ad - bc)x = du - bv,$
    $(ad - bc)y = av - cu.$

The factor on the left is

(2.5.3)    $\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc,$

and solving (2.5.2) for $x$ and $y$ leads to

(2.5.4)    $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Longrightarrow A^{-1} = \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$

provided $\det A \neq 0$.
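The formula (2.5.4) translates directly into code. The sketch below assumes integer or rational entries and uses Python's fractions module to keep the arithmetic exact; the function names and the sample system are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of the matrix with rows (a, b) and (c, d), following (2.5.4)."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible: ad - bc = 0")
    det = Fraction(det)
    return [[d / det, -b / det], [-c / det, a / det]]

def solve_2x2(a, b, c, d, u, v):
    """Solve ax + by = u, cx + dy = v by applying the inverse to (u, v)."""
    inv = inverse_2x2(a, b, c, d)
    return (inv[0][0] * u + inv[0][1] * v, inv[1][0] * u + inv[1][1] * v)

# Sample system: 2x + y = 4, 5x + 3y = 11.
x, y = solve_2x2(2, 1, 5, 3, 4, 11)
assert 2 * x + y == 4 and 5 * x + 3 * y == 11
```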


We now consider determinants of $n \times n$ matrices. Let $M(n, \mathbb{F})$ denote the set
of $n \times n$ matrices with entries in $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. We write

(2.5.5)    $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} = (a_1, \dots, a_n),$

where

(2.5.6)    $a_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{pmatrix}$

is the jth column of $A$. The determinant is defined as follows.

Proposition 2.5.1. There is a unique function

(2.5.7)    $\vartheta : M(n, \mathbb{F}) \to \mathbb{F},$

satisfying the following three properties:

(a) $\vartheta$ is linear in each column $a_j$ of $A$,
(b) $\vartheta(\tilde{A}) = -\vartheta(A)$ if $\tilde{A}$ is obtained from $A$ by interchanging two columns,
(c) $\vartheta(I) = 1$.

This defines the determinant:

(2.5.8)    $\vartheta(A) = \det A.$

If (c) is replaced by

(c') $\vartheta(I) = r$,

then

(2.5.9)    $\vartheta(A) = r \det A.$


The proof will involve constructing an explicit formula for $\det A$ by following
the rules (a)-(c). We start with the case $n = 3$. We have

(2.5.10)    $\det A = \sum_{j=1}^3 a_{j1} \det(e_j, a_2, a_3),$

by applying (a) to the first column of $A$, $a_1 = \sum_j a_{j1} e_j$. Here and below, $\{e_j : 1 \le j \le n\}$
denotes the standard basis of $\mathbb{F}^n$, so $e_j$ has a 1 in the jth slot and 0s
elsewhere. Applying (a) to the second and third columns gives

(2.5.11)    $\det A = \sum_{j,k=1}^3 a_{j1} a_{k2} \det(e_j, e_k, a_3) = \sum_{j,k,\ell=1}^3 a_{j1} a_{k2} a_{\ell 3} \det(e_j, e_k, e_\ell).$


This is a sum of 27 terms, but most of them are 0. Note that rule (b) implies

(2.5.12)    $\det B = 0$ whenever $B$ has two identical columns.

Hence $\det(e_j, e_k, e_\ell) = 0$ unless $j$, $k$, and $\ell$ are distinct, that is, unless $(j,k,\ell)$ is a
permutation of $(1,2,3)$. Now rule (c) says

(2.5.13)    $\det(e_1, e_2, e_3) = 1,$

and we see from rule (b) that $\det(e_j, e_k, e_\ell) = 1$ if one can convert $(e_j, e_k, e_\ell)$ to the
triple $(e_1, e_2, e_3)$ by an even number of column interchanges, and $\det(e_j, e_k, e_\ell) = -1$
if it takes an odd number of interchanges. Explicitly,

(2.5.14)
    $\det(e_1, e_2, e_3) = 1, \quad \det(e_1, e_3, e_2) = -1,$
    $\det(e_2, e_3, e_1) = 1, \quad \det(e_2, e_1, e_3) = -1,$
    $\det(e_3, e_1, e_2) = 1, \quad \det(e_3, e_2, e_1) = -1.$

Consequently (2.5.11) yields

(2.5.15)
    $\det A = a_{11} a_{22} a_{33} - a_{11} a_{32} a_{23}$
    $\qquad + a_{21} a_{32} a_{13} - a_{21} a_{12} a_{33}$
    $\qquad + a_{31} a_{12} a_{23} - a_{31} a_{22} a_{13}.$


Note that the second indices occur in $(1,2,3)$ order in each product. We can
rearrange these products so that the first indices occur in $(1,2,3)$ order:

(2.5.16)
    $\det A = a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32}$
    $\qquad + a_{13} a_{21} a_{32} - a_{12} a_{21} a_{33}$
    $\qquad + a_{12} a_{23} a_{31} - a_{13} a_{22} a_{31}.$

Now we tackle the case of general $n$. Parallel to (2.5.10)-(2.5.11), we have

(2.5.17)    $\det A = \sum_j a_{j1} \det(e_j, a_2, \dots, a_n) = \cdots = \sum_{j_1, \dots, j_n} a_{j_1 1} \cdots a_{j_n n} \det(e_{j_1}, \dots, e_{j_n}),$

by applying rule (a) to each of the $n$ columns of $A$. As before, the observation
(2.5.12) implies $\det(e_{j_1}, \dots, e_{j_n}) = 0$ unless $(j_1, \dots, j_n)$ are all distinct, that is,
unless $(j_1, \dots, j_n)$ is a permutation of the set $(1, 2, \dots, n)$. We set

(2.5.18)    $S_n$ = set of permutations of $(1, 2, \dots, n)$.
That is, $S_n$ consists of elements $\sigma$, mapping the set $\{1, \dots, n\}$ to itself,

(2.5.19)    $\sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\},$

that are one-to-one and onto. We can compose two such permutations, obtaining
the product $\sigma\tau \in S_n$, given $\sigma$ and $\tau$ in $S_n$. A permutation that interchanges just
two elements of $\{1, \dots, n\}$, say $j$ and $k$ ($j \neq k$), is called a transposition, and
labeled $(jk)$. It is easy to see that each permutation of $\{1, \dots, n\}$ can be achieved
by successively transposing pairs of elements of this set. That is, each element
$\sigma \in S_n$ is a product of transpositions. We claim that

(2.5.20)    $\det(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = (\operatorname{sgn} \sigma) \det(e_1, \dots, e_n) = \operatorname{sgn} \sigma,$

where

(2.5.21)
    $\operatorname{sgn} \sigma = 1$ if $\sigma$ is a product of an even number of transpositions,
    $\operatorname{sgn} \sigma = -1$ if $\sigma$ is a product of an odd number of transpositions.

In fact, the first identity in (2.5.20) follows from rule (b) and the second identity
from rule (c).


There is one point to be checked here. Namely, we claim that a given $\sigma \in S_n$
cannot simultaneously be written as a product of an even number of transpositions
and an odd number of transpositions. If $\sigma$ could be so written, $\operatorname{sgn} \sigma$ would not
be well defined, and it would be impossible to satisfy condition (b), so Proposition
2.5.1 would fail. One neat way to see that $\operatorname{sgn} \sigma$ is well defined is the following. Let
$\sigma \in S_n$ act on functions of $n$ variables by

(2.5.22)    $(\sigma f)(x_1, \dots, x_n) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)}).$

It is readily verified that if also $\tau \in S_n$,

(2.5.23)    $g = \sigma f \Longrightarrow \tau g = (\tau\sigma) f.$

Now, let $P$ be the polynomial

(2.5.24)    $P(x_1, \dots, x_n) = \prod_{1 \le j < k \le n} (x_j - x_k).$

One readily has

(2.5.25)    $(\sigma P)(x) = -P(x)$, whenever $\sigma$ is a transposition,

and hence, by (2.5.23),

(2.5.26)    $(\sigma P)(x) = (\operatorname{sgn} \sigma) P(x), \quad \forall\, \sigma \in S_n,$


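and $\operatorname{sgn} \sigma$ is well defined.

The proof of (2.5.20) is complete, and substitution into (2.5.17) yields the
formula

(2.5.27)    $\det A = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(n)n}.$

It is routine to check that this satisfies the properties (a)-(c). Regarding (b), note
that if $\vartheta(A)$ denotes the right side of (2.5.27) and $\tilde{A}$ is obtained from $A$ by applying
a permutation $\tau$ to the columns of $A$, so $\tilde{A} = (a_{\tau(1)}, \dots, a_{\tau(n)})$, then

(2.5.28)
    $\vartheta(\tilde{A}) = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma(1)\tau(1)} \cdots a_{\sigma(n)\tau(n)}$
    $\qquad = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma\tau^{-1}(1)1} \cdots a_{\sigma\tau^{-1}(n)n}$
    $\qquad = \sum_{\omega \in S_n} (\operatorname{sgn} \omega\tau)\, a_{\omega(1)1} \cdots a_{\omega(n)n}$
    $\qquad = (\operatorname{sgn} \tau)\, \vartheta(A),$

the last identity because

(2.5.29)    $\operatorname{sgn} \omega\tau = (\operatorname{sgn} \omega)(\operatorname{sgn} \tau), \quad \forall\, \omega, \tau \in S_n.$

As for the final part of Proposition 2.5.1, if (c) is replaced by (c'), then (2.5.20)
is replaced by

(2.5.30)    $\vartheta(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = r(\operatorname{sgn} \sigma),$

and (2.5.9) follows.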
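The permutation formula (2.5.27) can be evaluated directly for small matrices. In the sketch below, the sign of a permutation is read off from (2.5.26) by evaluating both sides at the sample point $x = (1, \dots, n)$; indices are 0-based in the code, and the function names are our own. Since the sum has $n!$ terms, this is a way to experiment with the formula rather than a practical algorithm for large $n$.

```python
from itertools import permutations

def sgn(sigma):
    """Sign of sigma via (2.5.26), evaluated at the point x = (1, ..., n).

    sigma is a tuple of 0-based images: sigma[j] is the image of j.
    """
    n = len(sigma)
    num = 1  # (sigma P)(x): product of x_{sigma(j)} - x_{sigma(k)} over j < k
    den = 1  # P(x): product of x_j - x_k over j < k
    for j in range(n):
        for k in range(j + 1, n):
            num *= (sigma[j] + 1) - (sigma[k] + 1)
            den *= (j + 1) - (k + 1)
    return num // den  # exact division: the two products agree up to sign

def det(A):
    """Sum over sigma of sgn(sigma) times the product of a_{sigma(j) j}, as in (2.5.27)."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for j in range(n):
            term *= A[sigma[j]][j]
        total += term
    return total

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
# Interchanging the first two columns changes the sign, as rule (b) requires.
A_swapped = [[0, 2, 1], [3, 1, 2], [1, 1, 4]]
assert det(A_swapped) == -det(A)
```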


REMARK. The identity (2.5.27) is taken as a definition of the determinant by some 
authors. While it is a useful formula for the determinant, it is a bad definition, 
which has perhaps led to a bit of fear and loathing among math students. 


REMARK. Here is another formula for $\operatorname{sgn} \sigma$, which the reader is invited to verify.
If $\sigma \in S_n$,

(2.5.31)    $\operatorname{sgn} \sigma = (-1)^{\kappa(\sigma)},$

where

(2.5.32)    $\kappa(\sigma)$ = number of pairs $(j,k)$ such that $1 \le j < k \le n$, but $\sigma(j) > \sigma(k)$.

Note that

(2.5.33)    $a_{\sigma(1)1} \cdots a_{\sigma(n)n} = a_{1\tau(1)} \cdots a_{n\tau(n)}, \quad$ with $\tau = \sigma^{-1},$

and $\operatorname{sgn} \sigma = \operatorname{sgn} \sigma^{-1}$, so, parallel to (2.5.16), we also have

(2.5.34)    $\det A = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)}.$

Comparison with (2.5.27) gives

(2.5.35)    $\det A = \det A^t,$

where $A = (a_{jk}) \Rightarrow A^t = (a_{kj})$. Note that the jth column of $A^t$ has the same
entries as the jth row of $A$. In light of this, we have:


Corollary 2.5.2. In Proposition 2.5.1, one can replace “columns” by “rows.” 
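The inversion count in (2.5.32) and the agreement of the column formula (2.5.27) with the row formula (2.5.34) can both be tried out numerically. The following sketch uses 0-based indices, and the names kappa, det_columns and det_rows are assumptions made for this illustration; it checks the identities on one sample matrix only.

```python
from itertools import permutations

def kappa(sigma):
    """Number of pairs j < k with sigma(j) > sigma(k), as in (2.5.32)."""
    n = len(sigma)
    return sum(1 for j in range(n) for k in range(j + 1, n) if sigma[j] > sigma[k])

def sgn(sigma):
    """The sign (-1)^kappa(sigma), as in (2.5.31)."""
    return -1 if kappa(sigma) % 2 else 1

def prod(values):
    result = 1
    for v in values:
        result *= v
    return result

def det_columns(A):
    """Formula (2.5.27): products a_{sigma(j) j}."""
    n = len(A)
    return sum(sgn(s) * prod(A[s[j]][j] for j in range(n))
               for s in permutations(range(n)))

def det_rows(A):
    """Formula (2.5.34): products a_{j sigma(j)}."""
    n = len(A)
    return sum(sgn(s) * prod(A[j][s[j]] for j in range(n))
               for s in permutations(range(n)))

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
A_t = [list(col) for col in zip(*A)]  # transpose of A
assert det_columns(A) == det_rows(A) == det_columns(A_t)  # illustrates (2.5.35)
```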


The following is a key property of the determinant. 
Proposition 2.5.3. Given A and B in M(n,F), 


(2.5.36) det(AB) = (det A) (det B). 
Proof. For fixed $A$, apply Proposition 2.5.1 to

(2.5.37)    $\vartheta_1(B) = \det(AB).$

If $B = (b_1, \dots, b_n)$, with jth column $b_j$, then

(2.5.38)    $AB = (Ab_1, \dots, Ab_n).$

Clearly rule (a) holds for $\vartheta_1$. Also, if $\tilde{B} = (b_{\sigma(1)}, \dots, b_{\sigma(n)})$ is obtained from $B$
by permuting its columns, then $A\tilde{B}$ has columns $(Ab_{\sigma(1)}, \dots, Ab_{\sigma(n)})$, obtained by
permuting the columns of $AB$ in the same fashion. Hence rule (b) holds for $\vartheta_1$.
Finally, rule (c') holds for $\vartheta_1$, with $r = \det A$, and (2.5.36) follows.
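A quick numerical sanity check of (2.5.36) is sketched below. It evaluates determinants by the permutation formula (2.5.27), with signs computed by counting inversions, and compares the two sides of (2.5.36) for one pair of sample matrices. The helper names and the matrices are illustrative assumptions; one example is of course no substitute for the proof.

```python
from itertools import permutations

def det(A):
    """Determinant via the permutation sum (2.5.27); sign from the inversion count."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        inversions = sum(1 for j in range(n) for k in range(j + 1, n) if s[j] > s[k])
        term = -1 if inversions % 2 else 1
        for j in range(n):
            term *= A[s[j]][j]
        total += term
    return total

def mat_mul(A, B):
    """Matrix product of lists of rows, as in (2.2.19)."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
assert det(mat_mul(A, B)) == det(A) * det(B)
```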


Corollary 2.5.4. If $A \in M(n, \mathbb{F})$ is invertible, then $\det A \neq 0$.


Proof. If $A$ is invertible, there exists $B \in M(n, \mathbb{F})$ such that $AB = I$. Then, by
(2.5.36), $(\det A)(\det B) = 1$, so $\det A \neq 0$.


The converse of Corollary 2.5.4 also holds. Before proving it, it is convenient to 
show that the determinant is invariant under a certain class of column operations, 
given as follows. 


Proposition 2.5.5. If $\tilde{A}$ is obtained from $A = (a_1, \dots, a_n) \in M(n, \mathbb{F})$ by adding
$c a_\ell$ to $a_k$, for some $c \in \mathbb{F}$, $\ell \neq k$, then

(2.5.39)    $\det \tilde{A} = \det A.$


Proof. By rule (a), $\det \tilde{A} = \det A + c \det A^b$, where $A^b$ is obtained from $A$ by
replacing the column $a_k$ by $a_\ell$. Hence $A^b$ has two identical columns, so $\det A^b = 0$,
and (2.5.39) holds.


We now extend Corollary 2.5.4. 
Proposition 2.5.6. If $A \in M(n, \mathbb{F})$, then $A$ is invertible if and only if $\det A \neq 0$.


Proof. We have half of this from Corollary 2.5.4. To finish, assume $A$ is not
invertible. As seen in §2.3, this implies the columns $a_1, \dots, a_n$ of $A$ are linearly
dependent. Hence, for some $k$,

(2.5.40)    $a_k + \sum_{\ell \neq k} c_\ell a_\ell = 0,$

with $c_\ell \in \mathbb{F}$. Now we can apply Proposition 2.5.5 to obtain $\det A = \det \tilde{A}$, where
$\tilde{A}$ is obtained by adding $\sum_{\ell \neq k} c_\ell a_\ell$ to $a_k$. But then the kth column of $\tilde{A}$ is 0, so
$\det A = \det \tilde{A} = 0$. This finishes the proof of Proposition 2.5.6.

Further useful facts about determinants arise in the following exercises.


Exercises 
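1. Show that

(2.5.41)    $\det \begin{pmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \det \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \det A_{11},$

where $A_{11} = (a_{jk})_{2 \le j,k \le n}$.

Hint. Do the first identity using Proposition 2.5.5. Then exploit uniqueness for
$\det$ on $M(n-1, \mathbb{F})$.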


2. Deduce that $\det(e_j, a_2, \dots, a_n) = (-1)^{j-1} \det A_{1j}$, where $A_{kj}$ is formed by deleting
the kth column and the jth row from $A$.


3. Deduce from the first sum in (2.5.17) that

(2.5.42)    $\det A = \sum_{j=1}^n (-1)^{j-1} a_{j1} \det A_{1j}.$

More generally, for any $k \in \{1, \dots, n\}$,

(2.5.43)    $\det A = \sum_{j=1}^n (-1)^{j-k} a_{jk} \det A_{kj}.$

This is called an expansion of $\det A$ by minors, down the kth column.


4. Let $c_{kj} = (-1)^{j-k} \det A_{kj}$. Show that

(2.5.44)    $\sum_{j=1}^n a_{j\ell} c_{kj} = 0, \quad$ if $\ell \neq k$.

Deduce from this and (2.5.43) that $C = (c_{jk})$ satisfies

(2.5.45)    $CA = (\det A) I.$

Hint. Reason as in Exercises 1-3 that the left side of (2.5.44) is equal to

    $\det(a_1, \dots, a_\ell, \dots, a_\ell, \dots, a_n),$

with $a_\ell$ in the kth column as well as in the $\ell$th column. The identity (2.5.45) is
known as Cramer's formula. Note how this generalizes (2.5.4).




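5. Show that

(2.5.46)    $\det \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{pmatrix} = a_{11} a_{22} \cdots a_{nn}.$

Hint. Use (2.5.41) and induction. Alternative: Use (2.5.27). Show that if $\sigma \in S_n$,
then $\sigma(k) \le k\ \forall k \Rightarrow \sigma(k) \equiv k$.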


The next two exercises deal with the determinant of a linear transformation. Let 
V be an n-dimensional vector space, and 

(2.5.47)    $T : V \to V$

a linear transformation. We would like to define

(2.5.48)    $\det T = \det A,$

where $A = \mathcal{M}_S^S(T)$ for some basis $S = \{v_1, \dots, v_n\}$ of $V$.


6. Suppose $\tilde{S} = \{\tilde{v}_1, \dots, \tilde{v}_n\}$ is another basis of $V$. Show that

    $\det A = \det \tilde{A},$

where $\tilde{A} = \mathcal{M}_{\tilde{S}}^{\tilde{S}}(T)$. Hence (2.5.48) defines $\det T$, independently of the choice of
basis of $V$.
Hint. Use (2.4.13) and (2.5.36).

7. If also $U \in \mathcal{L}(V)$, show that


Row reduction, matrix products, and Gaussian elimination 


In Exercises 8-13, we consider the following three types of row operations on
an $n \times n$ matrix $A = (a_{jk})$. If $\sigma$ is a permutation of $\{1, \dots, n\}$, let

(2.5.49)    $\rho_\sigma(A) = (a_{\sigma(j)k}).$

If $c = (c_1, \dots, c_n)$, and all $c_j$ are nonzero, set

(2.5.50)    $\mu_c(A) = (c_j^{-1} a_{jk}).$

Finally, if $c \in \mathbb{F}$ and $\mu \neq \nu$, define

(2.5.51)    $\varepsilon_{\mu\nu c}(A) = (b_{jk}), \quad b_{\nu k} = a_{\nu k} - c a_{\mu k}, \quad b_{jk} = a_{jk}$ for $j \neq \nu$.


Note that a major part of this section dealt with the effect of such row operations 
on the determinant of a matrix. More precisely, they directly dealt with column 
operations, but as remarked after (2.5.35), one has analogues for row operations. 
We want to relate these operations to left multiplication by matrices $P_\sigma$, $M_c$, and $E_{\mu\nu c}$, defined by the following actions on the standard basis $\{e_1, \dots, e_n\}$ of $\mathbb{F}^n$:

(2.5.52)    $P_\sigma e_j = e_{\sigma(j)}, \quad M_c e_j = c_j e_j$,

and

(2.5.53)    $E_{\mu\nu c}\, e_\mu = e_\mu + c\, e_\nu, \quad E_{\mu\nu c}\, e_j = e_j$ for $j \ne \mu$.


These relations are established in the following exercises. 


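8. Show that

(2.5.54)    $A = P_\sigma\, \rho_\sigma(A), \quad A = M_c\, \mu_c(A), \quad A = E_{\mu\nu c}\, \varepsilon_{\mu\nu c}(A)$.


9. Show that $P_\sigma^{-1} = P_{\sigma^{-1}}$.

10. Show that, if $\mu \ne \nu$, then $E_{\mu\nu c} = P_\sigma^{-1} E_{21c} P_\sigma$, for some permutation $\sigma$.


11. If $B = \rho_\sigma(A)$ and $C = \mu_c(B)$, show that $A = P_\sigma M_c C$. Generalize this to other cases where a matrix $C$ is obtained from a matrix $A$ via a sequence of row operations.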


12. If $A$ is an invertible $n \times n$ matrix, with entries in $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$ (we write $A \in Gl(n, \mathbb{F})$), then the rows of $A$ form a basis of $\mathbb{F}^n$. Use this to show that $A$ can be transformed to the identity matrix via a sequence of row operations. Deduce that any $A \in Gl(n, \mathbb{F})$ can be written as a finite product of matrices of the form $P_\sigma$, $M_c$ and $E_{\mu\nu c}$.


13. Suppose $A$ is an invertible $n \times n$ matrix, and a sequence of row operations is applied to $A$, transforming it to the identity matrix $I$. Show that the same sequence of row operations, applied to $I$, transforms it to $A^{-1}$. This method of constructing $A^{-1}$ is called the method of Gaussian elimination.


EXAMPLE. We take a $2 \times 2$ matrix $A$, write $A$ and $I$ side by side, and perform the same sequence of row operations on each of these two matrices, obtaining finally $I$ and $A^{-1}$ side by side.


Hint. Turning around (2.5.54), we have

(2.5.55)    $\rho_\sigma(A) = P_\sigma^{-1} A, \quad \mu_c(A) = M_c^{-1} A, \quad \varepsilon_{\mu\nu c}(A) = E_{\mu\nu c}^{-1} A$.

Thus applying a sequence of row operations to $A$ yields

(2.5.56)    $S_k^{-1} \cdots S_1^{-1} A$,

where each $S_j$ is of the form (2.5.52) or (2.5.53). If (2.5.56) is the identity matrix, then

(2.5.57)    $A^{-1} = S_k^{-1} \cdots S_1^{-1}$.
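
The procedure of Exercise 13 can be sketched in code. What follows is only an illustration under our own assumptions: the function name `invert_by_row_ops` and the sample matrix are chosen here and do not come from the text. Exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction

def invert_by_row_ops(a):
    """Row-reduce [A | I] to [I | A^{-1}] using exact rationals."""
    n = len(a)
    # Write A and the identity matrix side by side.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Permutation step: bring a row with a nonzero pivot into place.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        m[col], m[pivot] = m[pivot], m[col]
        # Scaling step: make the pivot equal to 1.
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Subtraction step: clear the other entries of this column.
        for r in range(n):
            if r != col and m[r][col] != 0:
                c = m[r][col]
                m[r] = [x - c * y for x, y in zip(m[r], m[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in m]

A = [[1, 2], [3, 4]]
A_inv = invert_by_row_ops(A)
```

Each of the three steps in the loop is one of the row operations (2.5.49)-(2.5.51), applied simultaneously to both halves of the augmented matrix.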


REMARK. The method of Gaussian elimination is computationally superior to the 
use of Cramer’s formula (2.5.45) for computing matrix inverses, though Cramer’s 
formula has theoretical interest. 


A related issue is that, for computing determinants of $n \times n$ matrices, for $n > 3$, it is computationally superior to utilize a sequence of column operations, applying rules (a) and (b) and Proposition 2.5.5 (and/or the corresponding row operations), rather than directly using the formula (2.5.27), which contains $n!$ terms. This Gaussian elimination method of calculating $\det A$ gives, from (2.5.55)-(2.5.56),

(2.5.58)    $\det A = (\det S_1) \cdots (\det S_k)$,

with

(2.5.59)    $\det P_\sigma = \operatorname{sgn} \sigma, \quad \det M_c = c_1 \cdots c_n, \quad \det E_{\mu\nu c} = 1$.
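
To make the comparison concrete, here is a minimal sketch that computes a determinant both ways. The function names and the test matrix are our own choices, and the permutation sum is written in the standard form, which we assume matches (2.5.27) up to indexing conventions.

```python
from fractions import Fraction
from itertools import permutations

def det_by_elimination(a):
    """Determinant via row operations, collecting the factors in (2.5.59)."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det              # a transposition has sign -1
        det *= m[col][col]          # pivot factored out of its row
        for r in range(col + 1, n):
            c = m[r][col] / m[col][col]
            # Subtracting a multiple of one row from another leaves det fixed.
            m[r] = [x - c * y for x, y in zip(m[r], m[col])]
    return det

def det_by_permutations(a):
    """Determinant as a signed sum over all n! permutations."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total

M = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
d_elim = det_by_elimination(M)
d_perm = det_by_permutations(M)
```

The elimination routine uses on the order of $n^3$ arithmetic operations, while the permutation sum has $n!$ terms, which is the point of the remark above.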


2.6. Eigenvalues and eigenvectors 


Let $T : V \to V$ be linear. If there is a nonzero $v \in V$ such that

(2.6.1)    $Tv = \lambda_j v$,

for some $\lambda_j \in \mathbb{F}$, we say $\lambda_j$ is an eigenvalue of $T$, and $v$ is an eigenvector. Let $\mathcal{E}(T, \lambda_j)$ denote the set of vectors $v \in V$ such that (2.6.1) holds. It is clear that $\mathcal{E}(T, \lambda_j)$ is a linear subspace of $V$ and

(2.6.2)    $T : \mathcal{E}(T, \lambda_j) \to \mathcal{E}(T, \lambda_j)$.

The set of $\lambda_j \in \mathbb{F}$ such that $\mathcal{E}(T, \lambda_j) \ne 0$ is denoted $\operatorname{Spec}(T)$. Clearly, $\lambda_j \in \operatorname{Spec}(T)$ if and only if $T - \lambda_j I$ is not injective, so, if $V$ is finite dimensional,

(2.6.3)    $\lambda_j \in \operatorname{Spec}(T) \iff \det(\lambda_j I - T) = 0$.

We call $K_T(\lambda) = \det(\lambda I - T)$ the characteristic polynomial of $T$.


If F = C, we can use the fundamental theorem of algebra, which says every 
non-constant polynomial with complex coefficients has at least one complex root. 
(See Appendix 2.C for a proof of this result.) This proves the following. 


Proposition 2.6.1. If $V$ is a finite dimensional complex vector space and $T \in \mathcal{L}(V)$, then $T$ has at least one eigenvector in $V$.


REMARK. If $V$ is real and $K_T(\lambda)$ does have a real root $\lambda_j$, then there is a real eigenvector in $\mathcal{E}(T, \lambda_j)$.

Sometimes a linear transformation has only one eigenvector, up to a scalar multiple. Consider the transformation $A : \mathbb{C}^3 \to \mathbb{C}^3$ given by

(2.6.4)    $A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$.

We see that $\det(\lambda I - A) = (\lambda - 2)^3$, so $\lambda = 2$ is a triple root. It is clear that

(2.6.5)    $\mathcal{E}(A, 2) = \operatorname{Span}\{e_1\}$,

where $e_1 = (1, 0, 0)^t$ is the first standard basis vector of $\mathbb{C}^3$.

If one is given $T \in \mathcal{L}(V)$, it is of interest to know whether $V$ has a basis of eigenvectors of $T$. The following result is useful.


Proposition 2.6.2. Assume that the characteristic polynomial of $T \in \mathcal{L}(V)$ has $k$ distinct roots, $\lambda_1, \dots, \lambda_k$, with eigenvectors $v_j \in \mathcal{E}(T, \lambda_j)$, $1 \le j \le k$. Then $\{v_1, \dots, v_k\}$ is linearly independent. In particular, if $k = \dim V$, these vectors form a basis of $V$.

Proof. We argue by contradiction. If $\{v_1, \dots, v_k\}$ is linearly dependent, take a minimal subset that is linearly dependent and (reordering if necessary) say this set is $\{v_1, \dots, v_m\}$, with $T v_j = \lambda_j v_j$, and

(2.6.6)    $c_1 v_1 + \cdots + c_m v_m = 0$,

with $c_j \ne 0$ for each $j \in \{1, \dots, m\}$. Applying $T - \lambda_m I$ to (2.6.6) gives

(2.6.7)    $c_1 (\lambda_1 - \lambda_m) v_1 + \cdots + c_{m-1} (\lambda_{m-1} - \lambda_m) v_{m-1} = 0$,

a linear dependence relation on the smaller set $\{v_1, \dots, v_{m-1}\}$. This contradiction gives the proposition.


Further information on when T € L(V) yields a basis of eigenvectors, and on 
what one can say when it does not, will be given in the following sections. 


Exercises 


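1. Compute the eigenvalues and eigenvectors of each of the following matrices.

    $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$

    $\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}, \quad \begin{pmatrix} i & i \\ 0 & 1 \end{pmatrix}.$


In which cases does $\mathbb{C}^2$ have a basis of eigenvectors?

2. Compute the eigenvalues and eigenvectors of each of the following matrices.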


Or, S10 1d 
1 0 —-2 

—1 2 0 
1 0 1 
0 -l1 0 
1 0 1 


3. Let $A \in M(n, \mathbb{C})$. We say $A$ is diagonalizable if and only if there exists an invertible $B \in M(n, \mathbb{C})$ such that $B^{-1}AB$ is diagonal:

    $B^{-1}AB = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$.

Show that $A$ is diagonalizable if and only if $\mathbb{C}^n$ has a basis of eigenvectors of $A$.
Recall from (2.4.13) that the matrices $A$ and $B^{-1}AB$ are said to be similar.


4. More generally, if $V$ is an $n$-dimensional complex vector space, we say $T \in \mathcal{L}(V)$ is diagonalizable if and only if there exists invertible $B : \mathbb{C}^n \to V$ such that $B^{-1}TB$ is diagonal, with respect to the standard basis of $\mathbb{C}^n$. Formulate and establish the natural analogue of Exercise 3.


5. In the setting of (2.6.1)-(2.6.2), given $S \in \mathcal{L}(V)$, show that
    $ST = TS \implies S : \mathcal{E}(T, \lambda_j) \to \mathcal{E}(T, \lambda_j)$.


2.7. Generalized eigenvectors and the minimal polynomial 


As we have seen, the matrix

(2.7.1)    $A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$

has only one eigenvalue, 2, and, up to a scalar multiple, just one eigenvector, $e_1$. However, we have

(2.7.2)    $(A - 2I)^2 e_2 = 0, \quad (A - 2I)^3 e_3 = 0$.

Generally, if $T \in \mathcal{L}(V)$, we say a nonzero $v \in V$ is a generalized $\lambda_j$-eigenvector if there exists $k \in \mathbb{N}$ such that

(2.7.3)    $(T - \lambda_j I)^k v = 0$.

We denote by $\mathcal{GE}(T, \lambda_j)$ the set of vectors $v \in V$ such that (2.7.3) holds, for some $k$. It is clear that $\mathcal{GE}(T, \lambda_j)$ is a linear subspace of $V$ and

(2.7.4)    $T : \mathcal{GE}(T, \lambda_j) \to \mathcal{GE}(T, \lambda_j)$.

The following property of $T$ on $\mathcal{GE}(T, \lambda_j)$ will have important consequences.
Lemma 2.7.1. For each $\lambda_j \in \mathbb{F}$ such that $\mathcal{GE}(T, \lambda_j) \ne 0$,

(2.7.5)    $T - \mu I : \mathcal{GE}(T, \lambda_j) \to \mathcal{GE}(T, \lambda_j)$ is an isomorphism, $\forall\, \mu \ne \lambda_j$.

Proof. If $T - \mu I$ is not an isomorphism in (2.7.5), then $Tv = \mu v$ for some nonzero $v \in \mathcal{GE}(T, \lambda_j)$. But then $(T - \lambda_j I)^k v = (\mu - \lambda_j)^k v$ for all $k \in \mathbb{N}$, and hence this cannot ever be zero, unless $\mu = \lambda_j$.


Note that if $V$ is a finite dimensional complex vector space, then each nonzero space appearing in (2.7.4) contains an eigenvector, by Proposition 2.6.1. Clearly the corresponding eigenvalue must be $\lambda_j$. In particular, the set of $\lambda_j$ for which $\mathcal{GE}(T, \lambda_j)$ is nonzero coincides with $\operatorname{Spec}(T)$, as given in (2.6.3).

We intend to show that if $V$ is a finite dimensional complex vector space and $T \in \mathcal{L}(V)$, then $V$ is spanned by generalized eigenvectors of $T$. One tool in this demonstration will be the construction of polynomials $p(\lambda)$ such that $p(T) = 0$. Here, if

(2.7.6)    $p(\lambda) = a_n \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0$,

then

(2.7.7)    $p(T) = a_n T^n + a_{n-1} T^{n-1} + \cdots + a_1 T + a_0 I$.


Let us denote by $\mathcal{P}$ the space of polynomials in $\lambda$.


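Lemma 2.7.2. If $V$ is finite dimensional and $T \in \mathcal{L}(V)$, then there exists a nonzero $p \in \mathcal{P}$ such that $p(T) = 0$.


Proof. If $\dim V = n$, then $\dim \mathcal{L}(V) = n^2$, so $\{I, T, \dots, T^{n^2}\}$ is linearly dependent.


Let us set

(2.7.8)    $\mathcal{I}_T = \{p \in \mathcal{P} : p(T) = 0\}$.

We see that $\mathcal{I} = \mathcal{I}_T$ has the following properties:

(2.7.9)    $p, q \in \mathcal{I} \implies p + q \in \mathcal{I}$,
           $p \in \mathcal{I},\ q \in \mathcal{P} \implies pq \in \mathcal{I}$.


A set $\mathcal{I} \subset \mathcal{P}$ satisfying (2.7.9) is called an ideal. Here is another construction of a class of ideals in $\mathcal{P}$. Given $\{p_1, \dots, p_k\} \subset \mathcal{P}$, set

(2.7.10)    $\mathcal{I}(p_1, \dots, p_k) = \{p_1 q_1 + \cdots + p_k q_k : q_j \in \mathcal{P}\}$.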


We will find it very useful to know that all nonzero ideals in $\mathcal{P}$, including $\mathcal{I}_T$, have the following property.


Lemma 2.7.3. Let $\mathcal{I} \subset \mathcal{P}$ be a nonzero ideal, and let $p_1 \in \mathcal{I}$ have minimal degree amongst all nonzero elements of $\mathcal{I}$. Then

(2.7.11)    $\mathcal{I} = \mathcal{I}(p_1)$.

Proof. Take any $p \in \mathcal{I}$. We divide $p_1(\lambda)$ into $p(\lambda)$ and take the remainder, obtaining

(2.7.12)    $p(\lambda) = q(\lambda) p_1(\lambda) + r(\lambda)$.

Here $q, r \in \mathcal{P}$, hence $r \in \mathcal{I}$. Also $r(\lambda)$ has degree less than the degree of $p_1(\lambda)$, so by minimality we have $r \equiv 0$. This shows $p \in \mathcal{I}(p_1)$, and we have (2.7.11).


Applying this to $\mathcal{I}_T$, we denote by $m_T(\lambda)$ the polynomial of smallest degree in $\mathcal{I}_T$ (having leading coefficient 1), and say

(2.7.13)    $m_T(\lambda)$ is the minimal polynomial of $T$.

Thus every $p \in \mathcal{P}$ such that $p(T) = 0$ is a multiple of $m_T(\lambda)$.


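Assuming $V$ is a complex vector space of dimension $n$, we can apply the fundamental theorem of algebra to write

(2.7.14)    $m_T(\lambda) = \prod_{j=1}^K (\lambda - \lambda_j)^{k_j}$,

with distinct roots $\lambda_1, \dots, \lambda_K$. The following polynomials will also play a role in our study of the generalized eigenspaces of $T$. For each $\ell \in \{1, \dots, K\}$, set

(2.7.15)    $p_\ell(\lambda) = \prod_{j \ne \ell} (\lambda - \lambda_j)^{k_j} = \dfrac{m_T(\lambda)}{(\lambda - \lambda_\ell)^{k_\ell}}$.


We have the following useful result.


Proposition 2.7.4. If $V$ is an $n$-dimensional complex vector space and $T \in \mathcal{L}(V)$, then, for each $\ell \in \{1, \dots, K\}$,

(2.7.16)    $\mathcal{GE}(T, \lambda_\ell) = \mathcal{R}(p_\ell(T))$.


Proof. Given $v \in V$,

(2.7.17)    $(T - \lambda_\ell)^{k_\ell}\, p_\ell(T) v = m_T(T) v = 0$,

so $p_\ell(T) : V \to \mathcal{GE}(T, \lambda_\ell)$. Furthermore, each factor

(2.7.18)    $(T - \lambda_j)^{k_j} : \mathcal{GE}(T, \lambda_\ell) \to \mathcal{GE}(T, \lambda_\ell), \quad j \ne \ell$,

in $p_\ell(T)$ is an isomorphism, by Lemma 2.7.1, so $p_\ell(T) : \mathcal{GE}(T, \lambda_\ell) \to \mathcal{GE}(T, \lambda_\ell)$ is an isomorphism. □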


REMARK. We hence see that each $\lambda_j$ appearing in (2.7.14) is an element of $\operatorname{Spec} T$.


We now establish the following spanning property.


Proposition 2.7.5. If $V$ is an $n$-dimensional complex vector space and $T \in \mathcal{L}(V)$, then

(2.7.19)    $V = \mathcal{GE}(T, \lambda_1) + \cdots + \mathcal{GE}(T, \lambda_K)$.

That is, each $v \in V$ can be written as $v = v_1 + \cdots + v_K$, with $v_j \in \mathcal{GE}(T, \lambda_j)$.

Proof. Let $m_T(\lambda)$ be the minimal polynomial of $T$, with the factorization (2.7.14), and define $p_\ell(\lambda)$ as in (2.7.15), for $\ell = 1, \dots, K$. We claim that

(2.7.20)    $\mathcal{I}(p_1, \dots, p_K) = \mathcal{P}$.

In fact we know from Lemma 2.7.3 that $\mathcal{I}(p_1, \dots, p_K) = \mathcal{I}(p_0)$ for some $p_0 \in \mathcal{P}$. Then any root of $p_0(\lambda)$ must be a root of each $p_\ell(\lambda)$, $1 \le \ell \le K$. But these polynomials are constructed so that no $\mu \in \mathbb{C}$ is a root of all $K$ of them. Hence $p_0(\lambda)$ has no root, so (again by the fundamental theorem of algebra) it must be constant, i.e., $1 \in \mathcal{I}(p_1, \dots, p_K)$, which gives (2.7.20), and in particular we have that there exist $q_\ell \in \mathcal{P}$ such that

(2.7.21)    $p_1(\lambda) q_1(\lambda) + \cdots + p_K(\lambda) q_K(\lambda) = 1$.


We use this as follows to write an arbitrary $v \in V$ as a linear combination of generalized eigenvectors. Replacing $\lambda$ by $T$ in (2.7.21) gives

(2.7.22)    $p_1(T) q_1(T) + \cdots + p_K(T) q_K(T) = I$.

Hence, for any given $v \in V$,

(2.7.23)    $v = p_1(T) q_1(T) v + \cdots + p_K(T) q_K(T) v = v_1 + \cdots + v_K$,

with $v_\ell = p_\ell(T) q_\ell(T) v \in \mathcal{GE}(T, \lambda_\ell)$, by Proposition 2.7.4.


We next produce a basis consisting of generalized eigenvectors. 


Proposition 2.7.6. Under the hypotheses of Proposition 2.7.5, let $\mathcal{GE}(T, \lambda_\ell)$, $1 \le \ell \le K$, denote the generalized eigenspaces of $T$ (with $\lambda_\ell$ mutually distinct), and let

(2.7.24)    $S_\ell = \{v_{\ell 1}, \dots, v_{\ell, d_\ell}\}, \quad d_\ell = \dim \mathcal{GE}(T, \lambda_\ell)$,

be a basis of $\mathcal{GE}(T, \lambda_\ell)$. Then

(2.7.25)    $S = S_1 \cup \cdots \cup S_K$

is a basis of $V$.


Proof. It follows from Proposition 2.7.5 that $S$ spans $V$. We need to show that $S$ is linearly independent. To show this it suffices to show that if $w_\ell$ are nonzero elements of $\mathcal{GE}(T, \lambda_\ell)$, then no nontrivial linear combination can vanish. The demonstration of this is just slightly more elaborate than the corresponding argument in Proposition 2.6.2. If there exist such linearly dependent sets, take one with a minimal number of elements, and rearrange $\{\lambda_\ell\}$, to write it as $\{w_1, \dots, w_m\}$, so we have

(2.7.26)    $c_1 w_1 + \cdots + c_m w_m = 0$,

and $c_j \ne 0$ for each $j \in \{1, \dots, m\}$. As seen in Lemma 2.7.1,

(2.7.27)    $T - \mu I : \mathcal{GE}(T, \lambda_\ell) \to \mathcal{GE}(T, \lambda_\ell)$ is an isomorphism, $\forall\, \mu \ne \lambda_\ell$.


Take $k \in \mathbb{N}$ so large that $(T - \lambda_m I)^k$ annihilates each element of the basis $S_m$ of $\mathcal{GE}(T, \lambda_m)$, and apply $(T - \lambda_m I)^k$ to (2.7.26). Given (2.7.27), we will obtain a nontrivial linear dependence relation involving $m - 1$ terms, a contradiction, so the purported linear dependence relation cannot exist. This proves Proposition 2.7.6.
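EXAMPLE. Let us consider $A : \mathbb{C}^3 \to \mathbb{C}^3$, given by

(2.7.28)    $A = \begin{pmatrix} 2 & 3 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{pmatrix}$.

We have

(2.7.29)    $(A - 2I)(A - I) = \begin{pmatrix} 0 & 3 & 9 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad (A - 2I)^2 (A - I) = 0$,

hence $m_A(\lambda) = (\lambda - 2)^2 (\lambda - 1)$. Thus we have

(2.7.30)    $p_1(\lambda) = \lambda - 1, \quad p_2(\lambda) = (\lambda - 2)^2$,

using the ordering $\lambda_1 = 2$, $\lambda_2 = 1$. As for $q_\ell(\lambda)$ such that (2.7.21) holds, a little trial and error gives $q_1(\lambda) = -(\lambda - 3)$, $q_2(\lambda) = 1$, i.e.,

(2.7.31)    $-(\lambda - 1)(\lambda - 3) + (\lambda - 2)^2 = 1$.

Note that

(2.7.32)    $A - I = \begin{pmatrix} 1 & 3 & 3 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix}, \quad (A - 2I)^2 = \begin{pmatrix} 0 & 0 & 6 \\ 0 & 0 & -3 \\ 0 & 0 & 1 \end{pmatrix}$.

Hence, by (2.7.16),

(2.7.33)    $\mathcal{GE}(A, 2) = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\}, \quad \mathcal{GE}(A, 1) = \operatorname{Span}\left\{ \begin{pmatrix} 6 \\ -3 \\ 1 \end{pmatrix} \right\}$.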
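
The matrix identities in this example are easy to check by machine. The following sketch, with helper names `matmul` and `combine` of our own choosing, verifies (2.7.29), (2.7.32), and the operator form of (2.7.31) using integer arithmetic.

```python
def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def combine(x, y, sx=1, sy=1):
    """Entrywise linear combination sx*x + sy*y."""
    return [[sx * a + sy * b for a, b in zip(r, s)] for r, s in zip(x, y)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[2, 3, 3], [0, 2, 3], [0, 0, 1]]

A_minus_I = combine(A, I3, 1, -1)
A_minus_2I = combine(A, I3, 1, -2)
A_minus_3I = combine(A, I3, 1, -3)
A_minus_2I_sq = matmul(A_minus_2I, A_minus_2I)

# m_A(A) = (A - 2I)^2 (A - I) should vanish.
minimal_poly_at_A = matmul(A_minus_2I_sq, A_minus_I)

# (A - 2I)(A - I) is not zero, so the exponent 2 cannot be lowered.
lower_degree = matmul(A_minus_2I, A_minus_I)

# Operator form of (2.7.31): -(A - I)(A - 3I) + (A - 2I)^2 = I.
bezout = combine(matmul(A_minus_I, A_minus_3I), A_minus_2I_sq, -1, 1)
```

The columns of `A_minus_I` span $\mathcal{GE}(A, 2)$ and the columns of `A_minus_2I_sq` span $\mathcal{GE}(A, 1)$, in agreement with (2.7.33).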


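REMARK. In general, for $A \in M(3, \mathbb{C})$, there are the following three possibilities.

(I) $A$ has 3 distinct eigenvalues, $\lambda_1, \lambda_2, \lambda_3$. Then $\lambda_j$-eigenvectors $v_j$, $1 \le j \le 3$, span $\mathbb{C}^3$.

(II) $A$ has 2 distinct eigenvalues, say $\lambda_1$ (single) and $\lambda_2$ (double). Then

    $m_A(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2)^k, \quad k = 1$ or $2$.

Whatever the value of $k$, $p_2(\lambda) = \lambda - \lambda_1$, and hence

    $\mathcal{GE}(A, \lambda_2) = \mathcal{R}(A - \lambda_1 I)$,

which in turn is the span of the columns of $A - \lambda_1 I$. We have

    $\mathcal{GE}(A, \lambda_2) = \mathcal{E}(A, \lambda_2) \iff k = 1$.

In any case, $\mathbb{C}^3 = \mathcal{E}(A, \lambda_1) \oplus \mathcal{GE}(A, \lambda_2)$.

(III) $A$ has a triple eigenvalue, $\lambda_1$. Then $\operatorname{Spec}(A - \lambda_1 I) = \{0\}$, and

    $\mathcal{GE}(A, \lambda_1) = \mathbb{C}^3$.


Compare results of the next section.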
Exercises
1. Consider the matrices 
P02 Te *Q)) id 1 2 0O 
Ay = 0 2 0 , Ag a 0 2 0 , A3 = 3 1 3 
-1 0 -1 0 0 1 0 -2 1 


Compute the eigenvalues and eigenvectors of each $A_j$.


2. Find the minimal polynomial of $A_j$ and find a basis of generalized eigenvectors of $A_j$.


3. Consider the transformation $D : \mathcal{P}_2 \to \mathcal{P}_2$ given by (2.4.14). Find the eigenvalues and eigenvectors of $D$. Find the minimal polynomial of $D$ and find a basis of $\mathcal{P}_2$ consisting of generalized eigenvectors of $D$.


4. Suppose $V$ is a finite dimensional complex vector space and $T : V \to V$. Show that $V$ has a basis of eigenvectors of $T$ if and only if all the roots of the minimal polynomial $m_T(\lambda)$ are simple.


5. In the setting of (2.7.3)-(2.7.4), given $S \in \mathcal{L}(V)$, show that
    $ST = TS \implies S : \mathcal{GE}(T, \lambda_j) \to \mathcal{GE}(T, \lambda_j)$.


6. Show that if $V$ is an $n$-dimensional complex vector space, $S, T \in \mathcal{L}(V)$, and $ST = TS$, then $V$ has a basis consisting of vectors that are simultaneously generalized eigenvectors of $T$ and of $S$.

Hint. Apply Proposition 2.7.6 to $S : \mathcal{GE}(T, \lambda_j) \to \mathcal{GE}(T, \lambda_j)$.


7. Let $V$ be a complex $n$-dimensional vector space, and take $T \in \mathcal{L}(V)$, with minimal polynomial $m_T(\lambda)$, as in (2.7.13). For $\ell \in \{1, \dots, K\}$, set

    $P_\ell(\lambda) = \dfrac{m_T(\lambda)}{\lambda - \lambda_\ell}$.

Show that, for each $\ell \in \{1, \dots, K\}$, there exists $w_\ell \in V$ such that $v_\ell = P_\ell(T) w_\ell \ne 0$. Then show that $(T - \lambda_\ell I) v_\ell = 0$, so one has a proof of Proposition 2.6.1 that does not use determinants.


8. Show that Proposition 2.7.6 refines Proposition 2.7.5 to

    $V = \mathcal{GE}(T, \lambda_1) \oplus \cdots \oplus \mathcal{GE}(T, \lambda_K)$.


9. Given $A, B \in M(n, \mathbb{C})$, define $L_A, R_B : M(n, \mathbb{C}) \to M(n, \mathbb{C})$ by
    $L_A X = AX, \quad R_B X = XB$.
Show that if $\operatorname{Spec} A = \{\lambda_j\}$, $\operatorname{Spec} B = \{\mu_k\}$ ($= \operatorname{Spec} B^t$), then
    $\mathcal{GE}(L_A, \lambda_j) = \operatorname{Span}\{v w^t : v \in \mathcal{GE}(A, \lambda_j),\ w \in \mathbb{C}^n\}$,
    $\mathcal{GE}(R_B, \mu_k) = \operatorname{Span}\{v w^t : v \in \mathbb{C}^n,\ w \in \mathcal{GE}(B^t, \mu_k)\}$.
Show that
    $\mathcal{GE}(L_A - R_B, \sigma) = \operatorname{Span}\{v w^t : v \in \mathcal{GE}(A, \lambda_j),\ w \in \mathcal{GE}(B^t, \mu_k),\ \sigma = \lambda_j - \mu_k\}$.


10. In the setting of Exercise 9, show that if $A$ is diagonalizable, then $\mathcal{GE}(L_A, \lambda_j) = \mathcal{E}(L_A, \lambda_j)$. Draw analogous conclusions if also $B$ is diagonalizable.


11. In the setting of Exercise 9, show that if $\operatorname{Spec} A = \{\lambda_j\}$ and $\operatorname{Spec} B = \{\mu_k\}$, then

    $\operatorname{Spec}(L_A - R_B) = \{\lambda_j - \mu_k\}$.

Deduce that if $C_A : M(n, \mathbb{C}) \to M(n, \mathbb{C})$ is defined by
    $C_A X = AX - XA$,
then
    $\operatorname{Spec} C_A = \{\lambda_j - \lambda_k\}$.


2.8. Triangular matrices 
We say an $n \times n$ matrix $A = (a_{jk})$ is upper triangular if $a_{jk} = 0$ for $j > k$, and strictly upper triangular if $a_{jk} = 0$ for $j \ge k$. Similarly, we have the notion of lower triangular and strictly lower triangular matrices. Here are two examples:

(2.8.1)    $A = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}$;

$A$ is upper triangular and $B$ is strictly upper triangular; $A^t$ is lower triangular and $B^t$ strictly lower triangular. Note that $B^3 = 0$.

We say $T \in \mathcal{L}(V)$ is nilpotent provided $T^k = 0$ for some $k \in \mathbb{N}$. The following is a useful characterization of nilpotent transformations.


Proposition 2.8.1. Let $V$ be a finite dimensional complex vector space, $N \in \mathcal{L}(V)$. The following are equivalent:

(2.8.2)    $N$ is nilpotent,

(2.8.3)    $\operatorname{Spec}(N) = \{0\}$,

(2.8.4)    there is a basis of $V$ for which $N$ is strictly upper triangular,

(2.8.5)    there is a basis of $V$ for which $N$ is strictly lower triangular.


Proof. The implications (2.8.4) ⇒ (2.8.2) and (2.8.5) ⇒ (2.8.2) are easy. Also (2.8.4) implies the characteristic polynomial of $N$ is $\lambda^n$ (if $n = \dim V$), which is equivalent to (2.8.3), and similarly (2.8.5) ⇒ (2.8.3). We need to establish a couple more implications.

To see that (2.8.2) ⇒ (2.8.3), note that if $N^k = 0$ we can write

(2.8.6)    $(N - \mu I)^{-1} = -\dfrac{1}{\mu} \Bigl(I - \dfrac{1}{\mu} N\Bigr)^{-1} = -\dfrac{1}{\mu} \sum_{\ell=0}^{k-1} \dfrac{1}{\mu^\ell} N^\ell$,

whenever $\mu \ne 0$.

Next, given (2.8.3), $N : V \to V$ is not an isomorphism, so $V_1 = N(V)$ has dimension $\le n - 1$. Now $N_1 = N|_{V_1} \in \mathcal{L}(V_1)$ also has only 0 as an eigenvalue, so $N_1(V_1) = V_2$ has dimension $\le n - 2$, and so on. Thus $N^k = 0$ for sufficiently large $k$. We have (2.8.3) ⇒ (2.8.2). Now list these spaces as $V = V_0 \supset V_1 \supset \cdots \supset V_{k-1}$, with $V_{k-1} \ne 0$ but $N(V_{k-1}) = 0$. Pick a basis for $V_{k-1}$, augment it as in Proposition 2.3.5 to produce a basis for $V_{k-2}$, and continue, in this fashion obtaining a basis of $V$, with respect to which $N$ is strictly upper triangular. Thus (2.8.3) ⇒ (2.8.4). On the other hand, if we reverse the order of this basis we have a basis with respect to which $N$ is strictly lower triangular, so also (2.8.3) ⇒ (2.8.5). The proof of Proposition 2.8.1 is complete.


REMARK. Having proven Proposition 2.8.1, we see another condition equivalent to (2.8.2)-(2.8.5):

(2.8.7)    $N^k = 0, \quad \forall\, k \ge \dim V$.


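EXAMPLE. Consider

(2.8.8)    $N = \begin{pmatrix} 0 & 2 & 0 \\ 3 & 0 & 3 \\ 0 & -2 & 0 \end{pmatrix}$.

We have

(2.8.9)    $N^2 = \begin{pmatrix} 6 & 0 & 6 \\ 0 & 0 & 0 \\ -6 & 0 & -6 \end{pmatrix}, \quad N^3 = 0$.

Hence we have a chain $V = V_0 \supset V_1 \supset V_2$ as in the proof of Proposition 2.8.1, with

(2.8.10)    $V_2 = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \right\}, \quad V_1 = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\},$
            $V_0 = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \right\} = \operatorname{Span}\{v_1, v_2, v_3\}$,

and we have

(2.8.11)    $N v_1 = 0, \quad N v_2 = 2 v_1, \quad N v_3 = 3 v_2$,

so the matrix representation of $N$ with respect to the basis $\{v_1, v_2, v_3\}$ is

(2.8.12)    $\begin{pmatrix} 0 & 2 & 0 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}$.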
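
As a quick check of this example, the sketch below computes the powers of $N$ and its action on the basis vectors in (2.8.10). The helper names `matmul` and `apply` are our own.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

N = [[0, 2, 0], [3, 0, 3], [0, -2, 0]]
v1, v2, v3 = [1, 0, -1], [0, 1, 0], [1, 0, 0]

N2 = matmul(N, N)
N3 = matmul(N2, N)

# Images of the basis vectors; these give the columns of (2.8.12).
Nv1, Nv2, Nv3 = apply(N, v1), apply(N, v2), apply(N, v3)
```

Here `Nv2` is twice `v1` and `Nv3` is three times `v2`, which accounts for the entries 2 and 3 above the diagonal in the matrix representation.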
Generally, if $A$ is an upper triangular $n \times n$ matrix with diagonal entries $d_1, \dots, d_n$, the characteristic polynomial of $A$ is

(2.8.13)    $\det(\lambda I - A) = (\lambda - d_1) \cdots (\lambda - d_n)$,

by (2.5.46), so $\operatorname{Spec}(A) = \{d_j\}$. If $d_1, \dots, d_n$ are all distinct, it follows that $\mathbb{F}^n$ has a basis of eigenvectors of $A$.


We can show that whenever $V$ is a finite dimensional complex vector space and $T \in \mathcal{L}(V)$, then $V$ has a basis with respect to which $T$ is upper triangular. In fact, we can say a bit more. Recall what was established in Proposition 2.7.6. If $\operatorname{Spec}(T) = \{\lambda_\ell : 1 \le \ell \le K\}$ and $S_\ell = \{v_{\ell 1}, \dots, v_{\ell, d_\ell}\}$ is a basis of $\mathcal{GE}(T, \lambda_\ell)$, then $S = S_1 \cup \cdots \cup S_K$ is a basis of $V$. Now look more closely at

(2.8.14)    $T_\ell : V_\ell \to V_\ell, \quad V_\ell = \mathcal{GE}(T, \lambda_\ell), \quad T_\ell = T|_{V_\ell}$.


The result (2.7.5) says $\operatorname{Spec}(T_\ell) = \{\lambda_\ell\}$, i.e., $\operatorname{Spec}(T_\ell - \lambda_\ell I) = \{0\}$, so we can apply Proposition 2.8.1. Thus we can pick a basis $S_\ell$ of $V_\ell$ with respect to which $T_\ell - \lambda_\ell I$ is strictly upper triangular, hence in which $T_\ell$ takes the form

(2.8.15)    $A_\ell = \begin{pmatrix} \lambda_\ell & & * \\ & \ddots & \\ 0 & & \lambda_\ell \end{pmatrix}$.

Then, with respect to the basis $S = S_1 \cup \cdots \cup S_K$, $T$ has a matrix representation $A$ consisting of blocks $A_\ell$, given by (2.8.15). It follows that

(2.8.16)    $K_T(\lambda) = \det(\lambda I - T) = \prod_{\ell=1}^K (\lambda - \lambda_\ell)^{d_\ell}, \quad d_\ell = \dim V_\ell$.

This matrix representation also makes it clear that $K_T(T)|_{V_\ell} = 0$ for each $\ell \in \{1, \dots, K\}$ (cf. (2.8.7)), hence

(2.8.17)    $K_T(T) = 0$ on $V$.


This result is known as the Cayley-Hamilton theorem. Recalling the characterization of the minimal polynomial $m_T(\lambda)$ given in (2.7.11)-(2.7.13), we see that

(2.8.18)    $K_T(\lambda)$ is a polynomial multiple of $m_T(\lambda)$.
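
For a concrete instance of (2.8.17), one can use the standard fact (not derived in the text above) that a $2 \times 2$ matrix $A$ has characteristic polynomial $\lambda^2 - (\operatorname{Tr} A)\lambda + \det A$. The sketch below, whose function name and test matrix are our own choices, evaluates this polynomial at $A$.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cayley_hamilton_residual(a):
    """Return A^2 - (Tr A) A + (det A) I for a 2 x 2 matrix A."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a2 = matmul(a, a)
    ident = [[1, 0], [0, 1]]
    return [[a2[i][j] - tr * a[i][j] + det * ident[i][j] for j in range(2)]
            for i in range(2)]

residual = cayley_hamilton_residual([[1, 2], [3, 4]])
```

The result is the zero matrix, as the Cayley-Hamilton theorem predicts. The same check works for matrices with complex entries, since Python's arithmetic operators accept `complex` values.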


Exercises 
1. Consider 
12 0 0 -1 12 3 
A= (5 1) Az={]0 1 Of], Azs=]2 1 2 
10 0 St <2: EL. 


Compute the characteristic polynomial of each $A_j$ and verify that these matrices satisfy the Cayley-Hamilton theorem, stated in (2.8.17).

2. Let $\mathcal{P}_k$ denote the space of polynomials of degree $\le k$ in $x$, and consider
    $D : \mathcal{P}_k \to \mathcal{P}_k, \quad Dp(x) = p'(x)$.

Show that $D^{k+1} = 0$ on $\mathcal{P}_k$ and that $\{1, x, \dots, x^k\}$ is a basis of $\mathcal{P}_k$ with respect to which $D$ is strictly upper triangular.


3. Use the identity

    $(I - D)^{-1} = \sum_{\ell=0}^{k+1} D^\ell$, on $\mathcal{P}_k$,

to obtain a solution $u \in \mathcal{P}_k$ to

(2.8.19)    $u' - u = x^k$.


4. Use the equivalence of (2.8.19) with

    $\dfrac{d}{dx}\bigl(e^{-x} u\bigr) = x^k e^{-x}$

to obtain a formula for

    $\int x^k e^{-x}\, dx$.

For an alternative approach, see (1.1.45)-(1.1.52). See also exercises at the end of §3.4.


5. The proof of Proposition 2.8.1 given above includes the chain of implications
    (2.8.4) ⇒ (2.8.2) ⇒ (2.8.3) ⇒ (2.8.4).
Use Proposition 2.7.4 to show directly that
    (2.8.3) ⇒ (2.8.2).


6. Establish the following variant of Proposition 2.7.4. Let $K_T(\lambda)$ be the characteristic polynomial of $T$, as in (2.8.16), and set

    $P_\ell(\lambda) = \prod_{j \ne \ell} (\lambda - \lambda_j)^{d_j} = \dfrac{K_T(\lambda)}{(\lambda - \lambda_\ell)^{d_\ell}}$.

Show that
    $\mathcal{GE}(T, \lambda_\ell) = \mathcal{R}(P_\ell(T))$.

2.9. Inner products and norms 


Vectors in $\mathbb{R}^n$ have a dot product, given by

(2.9.1)    $v \cdot w = v_1 w_1 + \cdots + v_n w_n$,

where $v = (v_1, \dots, v_n)$, $w = (w_1, \dots, w_n)$. Then the norm of $v$, denoted $\|v\|$, is given by

(2.9.2)    $\|v\|^2 = v \cdot v = v_1^2 + \cdots + v_n^2$.

The geometrical significance of $\|v\|$ as the distance of $v$ from the origin is a version of the Pythagorean theorem. If $v, w \in \mathbb{C}^n$, we use

(2.9.3)    $(v, w) = v \cdot \overline{w} = v_1 \overline{w}_1 + \cdots + v_n \overline{w}_n$,

and then

(2.9.4)    $\|v\|^2 = (v, v) = |v_1|^2 + \cdots + |v_n|^2$;

here, if $v_j = x_j + i y_j$, with $x_j, y_j \in \mathbb{R}$, we have $\overline{v}_j = x_j - i y_j$, and $|v_j|^2 = x_j^2 + y_j^2$.

The objects (2.9.1) and (2.9.3) are special cases of inner products. Generally, an inner product on a vector space (over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) assigns to vectors $v, w \in V$ the quantity $(v, w) \in \mathbb{F}$, in a fashion that obeys the following three rules:

(2.9.5)    $(a_1 v_1 + a_2 v_2, w) = a_1 (v_1, w) + a_2 (v_2, w)$,
(2.9.6)    $(v, w) = \overline{(w, v)}$,
(2.9.7)    $(v, v) > 0$, unless $v = 0$.


If $\mathbb{F} = \mathbb{R}$, then (2.9.6) just means $(v, w) = (w, v)$. Note that (2.9.5)-(2.9.6) together imply

(2.9.8)    $(v, b_1 w_1 + b_2 w_2) = \overline{b}_1 (v, w_1) + \overline{b}_2 (v, w_2)$.


A vector space equipped with an inner product is called an inner product space. Inner products arise naturally in various contexts. For example,

(2.9.9)    $(f, g) = \int_a^b f(x) \overline{g(x)}\, dx$

defines an inner product on $C([a, b])$. It also defines an inner product on $\mathcal{P}$, the space of polynomials in $x$. Different choices of $a$ and $b$ yield different inner products on $\mathcal{P}$. More generally, one considers inner products of the form

(2.9.10)    $(f, g) = \int_a^b f(x) \overline{g(x)}\, w(x)\, dx$,

on various function spaces, where $w$ is a positive, integrable weight function.

Given an inner product on $V$, one says the object $\|v\|$ defined by

(2.9.11)    $\|v\| = \sqrt{(v, v)}$

is the norm on $V$ associated with the inner product. Generally, a norm on $V$ is a function $v \mapsto \|v\|$ satisfying

(2.9.12)    $\|av\| = |a| \cdot \|v\|, \quad \forall\, a \in \mathbb{F},\ v \in V$,
(2.9.13)    $\|v\| > 0$, unless $v = 0$,
(2.9.14)    $\|v + w\| \le \|v\| + \|w\|$.


Here $|a|$ denotes the absolute value of $a \in \mathbb{F}$. The property (2.9.14) is called the triangle inequality. A vector space equipped with a norm is called a normed vector space.
If $\|v\|$ is given by (2.9.11), from an inner product satisfying (2.9.5)-(2.9.7), it is clear that (2.9.12)-(2.9.13) hold, but (2.9.14) requires a demonstration. Note that

(2.9.15)    $\|v + w\|^2 = (v + w, v + w)$
            $= \|v\|^2 + (v, w) + (w, v) + \|w\|^2$
            $= \|v\|^2 + 2 \operatorname{Re}(v, w) + \|w\|^2$,

while

(2.9.16)    $(\|v\| + \|w\|)^2 = \|v\|^2 + 2\|v\| \cdot \|w\| + \|w\|^2$.

Thus to establish (2.9.14), it suffices to prove the following, known as Cauchy's inequality:


Proposition 2.9.1. For any inner product on a vector space $V$, with $\|v\|$ defined by (2.9.11),

(2.9.17)    $|(v, w)| \le \|v\|\, \|w\|, \quad \forall\, v, w \in V$.


Proof. We start with

(2.9.18)    $0 \le \|v - w\|^2 = \|v\|^2 - 2 \operatorname{Re}(v, w) + \|w\|^2$,

which implies

    $2 \operatorname{Re}(v, w) \le \|v\|^2 + \|w\|^2, \quad \forall\, v, w \in V$.

Replacing $v$ by $av$ for arbitrary $a \in \mathbb{F}$ of absolute value 1 yields $2 \operatorname{Re} a(v, w) \le \|v\|^2 + \|w\|^2$. This implies

(2.9.19)    $2 |(v, w)| \le \|v\|^2 + \|w\|^2, \quad \forall\, v, w \in V$.

Replacing $v$ by $tv$ and $w$ by $t^{-1} w$ for arbitrary $t \in (0, \infty)$, we have

(2.9.20)    $2 |(v, w)| \le t^2 \|v\|^2 + t^{-2} \|w\|^2, \quad \forall\, v, w \in V,\ t \in (0, \infty)$.


If we take $t^2 = \|w\| / \|v\|$, we obtain the desired inequality (2.9.17). (This assumes $v$ and $w$ are both nonzero, but (2.9.17) is trivial if $v$ or $w$ is 0.)


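There are other norms on vector spaces besides those that are associated with inner products. For example, on $\mathbb{F}^n$, we have

(2.9.21)    $\|v\|_1 = |v_1| + \cdots + |v_n|, \quad \|v\|_\infty = \max_{1 \le k \le n} |v_k|$,

and many others, but we will not dwell on this here.


If $V$ is a finite dimensional inner product space, a basis $\{u_1, \dots, u_n\}$ of $V$ is called an orthonormal basis of $V$ provided

(2.9.22)    $(u_j, u_k) = \delta_{jk}, \quad 1 \le j, k \le n$,

i.e.,

(2.9.23)    $\|u_j\| = 1, \quad j \ne k \implies (u_j, u_k) = 0$.


(When $(u_j, u_k) = 0$, we say $u_j$ and $u_k$ are orthogonal.) When (2.9.22) holds, we have

(2.9.24)    $v = a_1 u_1 + \cdots + a_n u_n,\ w = b_1 u_1 + \cdots + b_n u_n \implies (v, w) = a_1 \overline{b}_1 + \cdots + a_n \overline{b}_n$.


It is often useful to construct orthonormal bases. The construction we now describe is called the Gram-Schmidt construction.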


It is often useful to construct orthonormal bases. The construction we now describe 
is called the Gram-Schmidt construction. 
Proposition 2.9.2. Let $\{v_1, \dots, v_n\}$ be a basis of $V$, an inner product space. Then there is an orthonormal basis $\{u_1, \dots, u_n\}$ of $V$ such that

(2.9.25)    $\operatorname{Span}\{u_j : j \le \ell\} = \operatorname{Span}\{v_j : j \le \ell\}, \quad 1 \le \ell \le n$.

Proof. To begin, take

(2.9.26)    $u_1 = \dfrac{1}{\|v_1\|}\, v_1$.

Now define the linear transformation $P_1 : V \to V$ by $P_1 v = (v, u_1) u_1$ and set

    $\tilde{v}_2 = v_2 - P_1 v_2 = v_2 - (v_2, u_1) u_1$.

We see that $(\tilde{v}_2, u_1) = (v_2, u_1) - (v_2, u_1) = 0$. Also $\tilde{v}_2 \ne 0$ since $u_1$ and $v_2$ are linearly independent. Hence we set

(2.9.27)    $u_2 = \dfrac{1}{\|\tilde{v}_2\|}\, \tilde{v}_2$.

Inductively, suppose we have an orthonormal set $\{u_1, \dots, u_m\}$ with $m < n$ and (2.9.25) holding for $1 \le \ell \le m$. Then define $P_m : V \to V$ (the orthogonal projection of $V$ onto $\operatorname{Span}(u_1, \dots, u_m)$) by

(2.9.28)    $P_m v = (v, u_1) u_1 + \cdots + (v, u_m) u_m$,

and set

(2.9.29)    $\tilde{v}_{m+1} = v_{m+1} - P_m v_{m+1} = v_{m+1} - (v_{m+1}, u_1) u_1 - \cdots - (v_{m+1}, u_m) u_m$.

We see that

(2.9.30)    $j \le m \implies (\tilde{v}_{m+1}, u_j) = (v_{m+1}, u_j) - (v_{m+1}, u_j) = 0$.

Also, since $v_{m+1} \notin \operatorname{Span}\{v_1, \dots, v_m\} = \operatorname{Span}\{u_1, \dots, u_m\}$, it follows that $\tilde{v}_{m+1} \ne 0$. Hence we set

(2.9.31)    $u_{m+1} = \dfrac{1}{\|\tilde{v}_{m+1}\|}\, \tilde{v}_{m+1}$.


This completes the construction.
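
The inductive step (2.9.28)-(2.9.31) translates directly into a short routine. The following is a minimal sketch for real vectors with the dot product (2.9.1); the function names and the input vectors are our own assumptions, and no attempt is made at numerical robustness for nearly dependent inputs.

```python
import math

def inner(v, w):
    """Dot product on R^n, as in (2.9.1)."""
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(vectors):
    """Return an orthonormal list u_1, ..., u_n built from v_1, ..., v_n."""
    basis = []
    for v in vectors:
        # Subtract the projection of v onto the span of the u_j found so far.
        w = list(v)
        for u in basis:
            coeff = inner(v, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w))
        if norm < 1e-12:
            raise ValueError("input vectors are (nearly) linearly dependent")
        # Normalize, as in (2.9.31).
        basis.append([wi / norm for wi in w])
    return basis

U = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

By construction the first $\ell$ output vectors span the same subspace as the first $\ell$ input vectors, which is the content of (2.9.25).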


EXAMPLE. Take $V = \mathcal{P}_2$, with basis $\{1, x, x^2\}$, and inner product given by

(2.9.32)    $(p, q) = \int_{-1}^{1} p(x) \overline{q(x)}\, dx$.


The Gram-Schmidt construction gives first

(2.9.33)    $u_1(x) = \dfrac{1}{\sqrt{2}}$.

Then

    $\tilde{v}_2(x) = x$,

since by symmetry $(x, u_1) = 0$. Now $\int_{-1}^{1} x^2\, dx = 2/3$, so we take

(2.9.34)    $u_2(x) = \sqrt{\dfrac{3}{2}}\, x$.

Next

    $\tilde{v}_3(x) = x^2 - (x^2, u_1) u_1 = x^2 - \dfrac{1}{3}$,

since by symmetry $(x^2, u_2) = 0$. Now $\int_{-1}^{1} (x^2 - 1/3)^2\, dx = 8/45$, so we take

(2.9.35)    $u_3(x) = \sqrt{\dfrac{45}{8}} \Bigl(x^2 - \dfrac{1}{3}\Bigr)$.
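
The integrals in this example can be confirmed with exact arithmetic. In the sketch below, polynomials with real coefficients are stored as coefficient lists (lowest degree first); the representation and the name `poly_inner` are our own choices.

```python
from fractions import Fraction

def poly_inner(p, q):
    """Integral over [-1, 1] of p(x) q(x), for real coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:
                # Integral of x^n over [-1, 1] is 2/(n+1) for even n, 0 for odd n.
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

one, x, x2 = [1], [0, 1], [0, 0, 1]

# Projection coefficient of x^2 onto the constants: (x^2, 1)/(1, 1).
c = poly_inner(x2, one) / poly_inner(one, one)
v3 = [-c, 0, 1]                     # the polynomial x^2 - c

norm_sq_x = poly_inner(x, x)        # squared norm of x
norm_sq_v3 = poly_inner(v3, v3)     # squared norm of x^2 - c
```

The values obtained are $c = 1/3$, $\|x\|^2 = 2/3$, and $\|x^2 - 1/3\|^2 = 8/45$, matching the normalizing constants in (2.9.34) and (2.9.35).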


Exercises 


1. Let $V$ be a finite dimensional inner product space, and let $W$ be a linear subspace of $V$. Show that any orthonormal basis $\{w_1, \dots, w_k\}$ of $W$ can be enlarged to an orthonormal basis $\{w_1, \dots, w_k, u_1, \dots, u_\ell\}$ of $V$, with $k + \ell = \dim V$.


2. As in Exercise 1, let $V$ be a finite dimensional inner product space, and let $W$ be a linear subspace of $V$. Define the orthogonal complement

(2.9.36)    $W^\perp = \{v \in V : (v, w) = 0,\ \forall\, w \in W\}$.

Show that

    $W^\perp = \operatorname{Span}\{u_1, \dots, u_\ell\}$,

in the context of Exercise 1. Deduce that

(2.9.37)    $(W^\perp)^\perp = W$.


3. In the context of Exercise 2, show that

    $\dim V = n,\ \dim W = k \implies \dim W^\perp = n - k$.


4. Construct an orthonormal basis of the $(n - 1)$-dimensional vector space

    $V = \left\{ \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} \in \mathbb{R}^n : v_1 + \cdots + v_n = 0 \right\}$.
5. Take $V = \mathcal{P}_2$, with basis $\{1, x, x^2\}$, and inner product

    $(p, q) = \int_0^1 p(x) \overline{q(x)}\, dx$,

in contrast to (2.9.32). Construct an orthonormal basis of this inner product space.


6. Take $V$, with basis $\{1, \cos x, \sin x\}$, and inner product


(f.9) = ‘ savas: 


Construct an orthonormal basis of this inner product space. 
2.10. Norm, trace, and adjoint of a linear transformation


If $V$ and $W$ are normed linear spaces and $T \in \mathcal{L}(V, W)$, we define

(2.10.1)    $\|T\| = \sup\{\|Tv\| : \|v\| \le 1\}$.

Equivalently, $\|T\|$ is the smallest quantity $K$ such that

(2.10.2)    $\|Tv\| \le K \|v\|, \quad \forall\, v \in V$.


We call $\|T\|$ the operator norm of $T$. If $V$ and $W$ are finite dimensional, it can be shown that $\|T\| < \infty$ for all $T \in \mathcal{L}(V, W)$. We omit the general argument, but we will make some estimates below when $V$ and $W$ are inner product spaces.

Note that if also $S : W \to X$, another normed vector space, then

(2.10.3)    $\|STv\| \le \|S\|\, \|Tv\| \le \|S\|\, \|T\|\, \|v\|, \quad \forall\, v \in V$,

and hence

(2.10.4)    $\|ST\| \le \|S\|\, \|T\|$.

In particular, we have by induction that

(2.10.5)    $T : V \to V \implies \|T^n\| \le \|T\|^n$.


This will be useful when we discuss the exponential of a linear transformation, in Chapter 3.


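We turn to the notion of the trace of a transformation $T \in \mathcal{L}(V)$, given $\dim V < \infty$. We start with the trace of an $n \times n$ matrix, which is simply the sum of the diagonal elements,

(2.10.6)    $A = (a_{jk}) \in M(n, \mathbb{F}) \implies \operatorname{Tr} A = \sum_{j=1}^n a_{jj}$.


Note that if also $B = (b_{jk}) \in M(n, \mathbb{F})$, then

(2.10.7)    $AB = C = (c_{jk}), \quad c_{jk} = \sum_\ell a_{j\ell} b_{\ell k}$,
            $BA = D = (d_{jk}), \quad d_{jk} = \sum_\ell b_{j\ell} a_{\ell k}$,

and hence

(2.10.8)    $\operatorname{Tr} AB = \sum_{j,\ell} a_{j\ell} b_{\ell j} = \operatorname{Tr} BA$.

Hence, if $B$ is invertible,

(2.10.9)    $\operatorname{Tr} B^{-1} A B = \operatorname{Tr} A B B^{-1} = \operatorname{Tr} A$.

Thus if $T \in \mathcal{L}(V)$, we can choose a basis $S = \{v_1, \dots, v_n\}$ of $V$, if $\dim V = n$, and define

(2.10.10)    $\operatorname{Tr} T = \operatorname{Tr} A, \quad A = \mathcal{M}^S_S(T)$,

and (2.10.9) implies this is independent of the choice of basis.


Next we define the adjoint of $T \in \mathcal{L}(V, W)$, when $V$ and $W$ are finite dimensional inner product spaces, as the transformation $T^* \in \mathcal{L}(W, V)$ with the property

(2.10.11)    $(Tv, w) = (v, T^* w), \quad \forall\, v \in V,\ w \in W$.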
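
The identities (2.10.8) and (2.10.9) can be illustrated on a small example. The matrices below are our own choices; $B$ was picked with determinant 1 so that its inverse has integer entries.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries, as in (2.10.6)."""
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 4]]
B = [[2, 1], [1, 1]]
B_inv = [[1, -1], [-1, 2]]          # inverse of B, since det B = 1

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_conjugated = trace(matmul(matmul(B_inv, A), B))
```

Although $AB$ and $BA$ are different matrices here, their traces agree, and conjugating $A$ by $B$ leaves the trace unchanged.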
If $\{v_1, \dots, v_n\}$ is an orthonormal basis of $V$ and $\{w_1, \dots, w_m\}$ an orthonormal basis of $W$, then

(2.10.12)    $A = (a_{ij}), \quad a_{ij} = (T v_j, w_i)$,

is the matrix representation of $T$, as in (2.4.2), and the matrix representation of $T^*$ is

(2.10.13)    $A^* = (\overline{a}_{ji})$.


Now we define the Hilbert-Schmidt norm of $T \in \mathcal{L}(V, W)$ when $V$ and $W$ are finite dimensional inner product spaces. Namely, we set

(2.10.14)    $\|T\|_{HS}^2 = \operatorname{Tr} T^* T$.


In terms of the matrix representation (2.10.12) of $T$, we have

(2.10.15)    $T^* T = (b_{jk}), \quad b_{jk} = \sum_\ell \overline{a}_{\ell j}\, a_{\ell k}$,

hence

(2.10.16)    $\|T\|_{HS}^2 = \sum_j b_{jj} = \sum_{j,k} |a_{jk}|^2$.


Equivalently, using an arbitrary orthonormal basis $\{v_1, \dots, v_n\}$ of $V$, we have

(2.10.17)    $\|T\|_{HS}^2 = \sum_{j=1}^n \|T v_j\|^2$.


Using (2.10.17), we can show that the operator norm of $T$ is dominated by the Hilbert-Schmidt norm:

(2.10.18)    $\|T\| \le \|T\|_{HS}$.


In fact, pick a unit $v_1 \in V$ such that $\|T v_1\|$ is maximized on $\{v : \|v\| \le 1\}$, extend this to an orthonormal basis $\{v_1, \dots, v_n\}$, and use

    $\|T\|^2 = \|T v_1\|^2 \le \sum_{j=1}^n \|T v_j\|^2 = \|T\|_{HS}^2$.

Also we can dominate each term on the right side of (2.10.17) by $\|T\|^2$, so

(2.10.19)    $\|T\|_{HS} \le \sqrt{n}\, \|T\|, \quad n = \dim V$.

Another consequence of (2.10.17)-(2.10.18) is

(2.10.20)    $\|ST\|_{HS} \le \|S\|\, \|T\|_{HS} \le \|S\|_{HS} \|T\|_{HS}$,

for $S$ as in (2.10.3). In particular, parallel to (2.10.5), we have

(2.10.21)    $T : V \to V \implies \|T^n\|_{HS} \le \|T\|_{HS}^n$.
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a a | 
Exercises 


1. Suppose V and W are finite dimensional inner product spaces and T ∈ L(V, W).
Show that
T^{**} = T.


2. In the context of Exercise 1, show that
T injective \Longleftrightarrow T^* surjective.
More generally, show that
N(T) = R(T^*)^{\perp}.

(See Exercise 2 of §2.9 for a discussion of the orthogonal complement W^⊥.)


3. Say A is a k × n real matrix and the k columns are linearly independent. Show
that A has k linearly independent rows. (Similarly, treat complex matrices.)

Hint. The hypothesis is equivalent to A : R^k → R^n being injective. What does
that say about A* : R^n → R^k?


4. If A is a k × n real (or complex) matrix, we define the column rank of A to be
the dimension of the span of the columns of A. We similarly define the row rank of
A. Show that the row rank of A is equal to its column rank.

Hint. Reduce this to showing dim R(A) = dim R(A*). Apply Exercise 2 (and
Exercise 3 of §2.9).


5. Suppose A is an n × n matrix and ||A|| < 1. Show that

(I - A)^{-1} = I + A + A^2 + \cdots + A^k + \cdots,

a convergent infinite series.


6. If A is an n × n complex matrix, show that

\lambda \in \operatorname{Spec}(A) \Longrightarrow |\lambda| \le \|A\|.


7. Show that, for any real θ, the matrix
A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
has operator norm 1. Compute its Hilbert-Schmidt norm.
8. Given a > b > 0, show that the matrix
B = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
has operator norm a. Compute its Hilbert-Schmidt norm.


9. Show that if V is an n-dimensional complex inner product space, then, for
T ∈ L(V),
\det T^* = \overline{\det T}.


10. If V is an n-dimensional inner product space, show that, for T ∈ L(V),

\|T\| = \sup \{ |(Tu, v)| : \|u\|, \|v\| \le 1 \}.

Show that
\|T^*\| = \|T\|.


2.11. Self-adjoint and skew-adjoint transformations 


If V is a finite-dimensional inner product space, T ∈ L(V) is said to be self-adjoint
if T = T* and skew-adjoint if T = -T*. If {u_1, ..., u_n} is an orthonormal basis of
V and A the matrix representation of T with respect to this basis, given by

(2.11.1) A = (a_{ij}), \quad a_{ij} = (Tu_j, u_i),

then T* is represented by A* = (\overline{a}_{ji}), so T is self-adjoint if and only if a_{ij} = \overline{a}_{ji},
and T is skew-adjoint if and only if a_{ij} = -\overline{a}_{ji}.


The eigenvalues and eigenvectors of these two classes of operators have special 
properties, as we proceed to show. 


Lemma 2.11.1. If λ_j is an eigenvalue of a self-adjoint T ∈ L(V), then λ_j is real.

Proof. Say Tv_j = λ_j v_j, v_j ≠ 0. Then

(2.11.2) \lambda_j \|v_j\|^2 = (Tv_j, v_j) = (v_j, Tv_j) = \overline{\lambda}_j \|v_j\|^2,

so λ_j = \overline{λ}_j.


This allows us to prove the following result for both real and complex vector 
spaces. 


Proposition 2.11.2. If V is a finite dimensional inner product space and T ∈ L(V)
is self-adjoint, then V has an orthonormal basis of eigenvectors of T.


Proof. Proposition 2.6.1 (and the comment following it in case F = R) implies
there is a unit v_1 ∈ V such that Tv_1 = λ_1 v_1, and we know λ_1 ∈ R. Say dim V = n.
Let

(2.11.3) W = \{ w \in V : (v_1, w) = 0 \}.

Then dim W = n - 1, as we can see by completing {v_1} to an orthonormal basis of
V. We claim

(2.11.4) T = T^* \Longrightarrow T : W \to W.

Indeed,

(2.11.5) w \in W \Longrightarrow (v_1, Tw) = (Tv_1, w) = \lambda_1 (v_1, w) = 0 \Longrightarrow Tw \in W.


An inductive argument gives an orthonormal basis of W consisting of eigenvectors
of T; adjoining v_1 yields such a basis of V, so Proposition 2.11.2 is proven.

The following could be deduced from Proposition 2.11.2, but we prove it directly.


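Proposition 2.11.3. Assume T ∈ L(V) is self-adjoint. If Tv_j = λ_j v_j, Tv_k =
λ_k v_k, and λ_j ≠ λ_k, then (v_j, v_k) = 0.

Proof. Then we have

\lambda_j (v_j, v_k) = (Tv_j, v_k) = (v_j, Tv_k) = \lambda_k (v_j, v_k),

the last identity holding because λ_k is real, by Lemma 2.11.1. Since λ_j ≠ λ_k, this forces (v_j, v_k) = 0.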


If F = C, we have 
(2.11.6) T skew-adjoint \Longleftrightarrow iT self-adjoint,
so Proposition 2.11.2 has an extension to skew-adjoint transformations if F = C. 
The case F = R requires further study. 


For concreteness, take V = R^n, with its standard inner product, and consider
a skew-adjoint transformation A : R^n → R^n. In this case, skew-adjointness is
equivalent to skew-symmetry:

(2.11.7) A = (a_{ij}), \quad a_{ij} = -a_{ji}. (We say A ∈ Skew(n).)

Now we can consider

(2.11.8) A : \mathbb{C}^n \to \mathbb{C}^n,

given by the same matrix as in (2.11.7), which is a matrix with real entries. Thus
the characteristic polynomial K_A(λ) = det(λI - A) is a polynomial of degree n
with real coefficients, so its non-real roots occur in complex conjugate pairs. Since
iA is self-adjoint on C^n, by (2.11.6), Lemma 2.11.1 implies that each eigenvalue of A
is purely imaginary. Thus the nonzero elements of Spec(A) are

(2.11.9) \operatorname{Spec}'(A) = \{ i\lambda_1, \dots, i\lambda_m, -i\lambda_1, \dots, -i\lambda_m \},

with λ_j ≠ λ_k if j ≠ k; for the sake of concreteness, say each λ_j > 0. By Proposition
2.11.2, applied to iA, C^n has an orthonormal basis of eigenvectors of A, and of course each such
basis element belongs to E(A, iλ_j) or to E(A, -iλ_j), for some j ∈ {1, ..., m}, or to
E(A, 0) = N(A). For each j ∈ {1, ..., m}, let


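(2.11.10) \{ v_{j1}, \dots, v_{j,d_j} \}

be an orthonormal basis of E(A, iλ_j). Say

(2.11.11) v_{jk} = \xi_{jk} + i\eta_{jk}, \quad \xi_{jk}, \eta_{jk} \in \mathbb{R}^n.

Then we can take

(2.11.12) \overline{v}_{jk} = \xi_{jk} - i\eta_{jk} \in \mathbb{C}^n,

and

(2.11.13) \{ \overline{v}_{j1}, \dots, \overline{v}_{j,d_j} \}

is an orthonormal basis of E(A, -iλ_j). Note that

(2.11.14) A\xi_{jk} = -\lambda_j \eta_{jk}, \quad A\eta_{jk} = \lambda_j \xi_{jk}, \quad 1 \le k \le d_j.

Note also that

(2.11.15) \operatorname{Span}_{\mathbb{C}} \{ \xi_{jk}, \eta_{jk} : 1 \le k \le d_j \} = E(A, i\lambda_j) + E(A, -i\lambda_j),

while we can also take

(2.11.16) \operatorname{Span}_{\mathbb{R}} \{ \xi_{jk}, \eta_{jk} : 1 \le k \le d_j \} = H(A, \lambda_j) \subset \mathbb{R}^n,

a linear subspace of R^n, of dimension 2d_j. Furthermore, applying Proposition 2.11.3
to iA, we see that

(2.11.17) (v_{jk}, \overline{v}_{jk}) = 0 \Longrightarrow \|\xi_{jk}\|^2 = \|\eta_{jk}\|^2, \text{ and } (\xi_{jk}, \eta_{jk}) = 0,

hence

(2.11.18) \|\xi_{jk}\| = \|\eta_{jk}\| = \frac{1}{\sqrt{2}}.

Making further use of

(2.11.19) (v_{ij}, \overline{v}_{k\ell}) = 0, \quad (v_{ij}, v_{k\ell}) = \delta_{ik}\delta_{j\ell},

we see that

(2.11.20) \{ \sqrt{2}\,\xi_{jk}, \sqrt{2}\,\eta_{jk} : 1 \le k \le d_j,\ 1 \le j \le m \}

is an orthonormal set in R^n, whose linear span over C coincides with the span of
all the eigenspaces of A in C^n associated with nonzero eigenvalues.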


Next we compare N_C(A) ⊂ C^n with N_R(A) ⊂ R^n. It is clear that, if v_j =
ξ_j + iη_j, with ξ_j, η_j ∈ R^n, then

(2.11.21) v_j \in N_{\mathbb{C}}(A) \Longleftrightarrow \xi_j, \eta_j \in N_{\mathbb{R}}(A),

since A is a real matrix. Thus, if {ξ_1, ..., ξ_μ} is an orthonormal basis for N_R(A), it
is also an orthonormal basis for N_C(A). Therefore we have the following conclusion:


Proposition 2.11.4. If A : R^n → R^n is skew-adjoint, then R^n has an orthonormal
basis in which the matrix representation of A consists of blocks

(2.11.22) \begin{pmatrix} 0 & \lambda_j \\ -\lambda_j & 0 \end{pmatrix},

plus perhaps a zero matrix, when N(A) ≠ 0.
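
As a small sanity check on the structure described in Proposition 2.11.4, one can take a 3 x 3 skew-symmetric matrix and confirm that A^3 = -λ^2 A for a single λ > 0. This forces every eigenvalue μ to satisfy μ^3 = -λ^2 μ, so the spectrum lies in {0, iλ, -iλ}, consistent with one block of the form (2.11.22) plus a zero block. The sketch below uses integer arithmetic; the sample vector y and the helper names are illustrative assumptions.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def skew_from_vector(y):
    """A 3 x 3 skew-symmetric matrix built from y; its null space contains y."""
    return [[0, -y[2], y[1]],
            [y[2], 0, -y[0]],
            [-y[1], y[0], 0]]

y = (1, 2, 2)                        # sample vector (hypothetical), |y|^2 = 9
A = skew_from_vector(y)
lam_sq = sum(c * c for c in y)       # lambda^2 = |y|^2, so lambda = 3

A_cubed = mat_mul(mat_mul(A, A), A)
A_times_y = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
```

Here A annihilates y, giving the zero block, while on the plane orthogonal to y it acts as a block (2.11.22) with λ = 3.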


Exercises 
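1. Verify Proposition 2.11.2 for V = R^3 and
T = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.

2. Verify Proposition 2.11.4 for
A = \begin{pmatrix} 0 & -1 & 2 \\ 1 & 0 & -3 \\ -2 & 3 & 0 \end{pmatrix}.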


3. In the setting of Proposition 2.11.2, suppose S, T ∈ L(V) are both self-adjoint
and suppose they commute, i.e., ST = TS. Show that V has an orthonormal basis
of vectors that are simultaneously eigenvectors of S and of T.


4. If V is a finite dimensional inner product space, we say T ∈ L(V) is positive
definite if and only if T = T* and

(2.11.23) (Tv, v) > 0 for all nonzero v ∈ V.

Show that T ∈ L(V) is positive definite if and only if T = T* and all its eigenvalues
are > 0. We say T is positive semidefinite if and only if T = T* and

(Tv, v) \ge 0, \quad \forall\, v \in V.

Show that T ∈ L(V) is positive semidefinite if and only if T = T* and all its
eigenvalues are ≥ 0.


5. If T ∈ L(V) is positive semidefinite, show that
\|T\| = \max \{ \lambda : \lambda \in \operatorname{Spec} T \}.


6. If S ∈ L(V), show that S*S is positive semidefinite, and
\|S\|^2 = \|S^*S\|.


2.12. Unitary and orthogonal transformations 


Let V be a finite-dimensional inner product space (over F) and T ∈ L(V). Suppose

(2.12.1) T^{-1} = T^*.

If F = C we say T is unitary, and if F = R we say T is orthogonal. We denote by
U(n) the set of unitary transformations on C^n and by O(n) the set of orthogonal
transformations on R^n. Note that (2.12.1) implies

(2.12.2) |\det T|^2 = (\det T)(\det T^*) = 1,

i.e., det T ∈ F has absolute value 1. In particular,

(2.12.3) T \in O(n) \Longrightarrow \det T = \pm 1.

We set

(2.12.4) SO(n) = \{ T \in O(n) : \det T = 1 \},
         SU(n) = \{ T \in U(n) : \det T = 1 \}.


As with self-adjoint and skew-adjoint transformations, the eigenvalues and 
eigenvectors of unitary transformations have special properties, as we now demon- 
strate. 


Lemma 2.12.1. If λ_j is an eigenvalue of a unitary T ∈ L(V), then |λ_j| = 1.

Proof. Say Tv_j = λ_j v_j, v_j ≠ 0. Then

(2.12.5) \|v_j\|^2 = (T^*Tv_j, v_j) = (Tv_j, Tv_j) = |\lambda_j|^2 \|v_j\|^2.

Next, parallel to Proposition 2.11.2, we show unitary transformations have
eigenvectors forming a basis.


Proposition 2.12.2. If V is a finite dimensional complex inner product space and
T ∈ L(V) is unitary, then V has an orthonormal basis of eigenvectors of T.


Proof. Proposition 2.6.1 implies there is a unit v_1 ∈ V such that Tv_1 = λ_1 v_1. Say
dim V = n. Let

(2.12.6) W = \{ w \in V : (v_1, w) = 0 \}.

As in the analysis of (2.11.3) we have dim W = n - 1. We claim

(2.12.7) T \text{ unitary} \Longrightarrow T : W \to W.

Indeed,

(2.12.8) w \in W \Longrightarrow (v_1, Tw) = (T^{-1}v_1, w) = \lambda_1^{-1} (v_1, w) = 0 \Longrightarrow Tw \in W.

Now, as in Proposition 2.11.2, an inductive argument gives an orthonormal basis
of W consisting of eigenvectors of T, so Proposition 2.12.2 is proven.


Next we have a result parallel to Proposition 2.11.3. 
Proposition 2.12.3. Assume T ∈ L(V) is unitary. If Tv_j = λ_j v_j and Tv_k =
λ_k v_k, and λ_j ≠ λ_k, then (v_j, v_k) = 0.

Proof. Then we have

\lambda_j (v_j, v_k) = (Tv_j, v_k) = (v_j, T^{-1}v_k) = \lambda_k (v_j, v_k),

since \overline{\lambda}_k^{-1} = \lambda_k.


We next examine the structure of orthogonal transformations, in a fashion
parallel to our study in §2.11 of skew-adjoint transformations on R^n. Thus let

(2.12.9) A : \mathbb{R}^n \to \mathbb{R}^n

be orthogonal, so

(2.12.10) AA^* = I,

which for real matrices is equivalent to AA^t = I. Now we can consider

A : \mathbb{C}^n \to \mathbb{C}^n,

given by the same matrix as in (2.12.9), a matrix with real entries. Thus the
characteristic polynomial K_A(λ) = det(λI - A) is a polynomial of degree n with
real coefficients, so its non-real roots occur in complex conjugate pairs. Thus the
elements of Spec(A) other than ±1 are given by

(2.12.11) \operatorname{Spec}'(A) = \{ \omega_1, \dots, \omega_m, \overline{\omega}_1, \dots, \overline{\omega}_m \}, \quad \overline{\omega}_j = \omega_j^{-1},

with the various listed eigenvalues mutually distinct. For the sake of concreteness,
say Im ω_j > 0 for each j ∈ {1, ..., m}. By Proposition 2.12.2, C^n has an orthonormal
basis of eigenvectors of A, and of course each such basis element belongs to
E(A, ω_j), or to E(A, \overline{ω}_j), for some j ∈ {1, ..., m}, or to E(A, 1) or E(A, -1). For
each j ∈ {1, ..., m}, let


(2.12.12) \{ v_{j1}, \dots, v_{j,d_j} \}

be an orthonormal basis of E(A, ω_j). Say

(2.12.13) v_{jk} = \xi_{jk} + i\eta_{jk}, \quad \xi_{jk}, \eta_{jk} \in \mathbb{R}^n.

Then we can take

(2.12.14) \overline{v}_{jk} = \xi_{jk} - i\eta_{jk} \in \mathbb{C}^n,

and

(2.12.15) \{ \overline{v}_{j1}, \dots, \overline{v}_{j,d_j} \}

is an orthonormal basis of E(A, \overline{ω}_j). Writing

(2.12.16) \omega_j = c_j + i s_j, \quad c_j, s_j \in \mathbb{R},

we have

(2.12.17) A\xi_{jk} = c_j \xi_{jk} - s_j \eta_{jk},
          A\eta_{jk} = s_j \xi_{jk} + c_j \eta_{jk},

for 1 ≤ k ≤ d_j. Note that

(2.12.18) \operatorname{Span}_{\mathbb{C}} \{ \xi_{jk}, \eta_{jk} : 1 \le k \le d_j \} = E(A, \omega_j) + E(A, \overline{\omega}_j),

while we can also take

(2.12.19) \operatorname{Span}_{\mathbb{R}} \{ \xi_{jk}, \eta_{jk} : 1 \le k \le d_j \} = H(A, \omega_j) \subset \mathbb{R}^n,

a linear subspace of R^n, of dimension 2d_j.

Parallel to the arguments involving (2.11.17)-(2.11.20), we have that

(2.12.20) \{ \sqrt{2}\,\xi_{jk}, \sqrt{2}\,\eta_{jk} : 1 \le k \le d_j,\ 1 \le j \le m \}

is an orthonormal set in R^n, whose linear span over C coincides with the span of
all the eigenspaces of A with eigenvalues ≠ ±1, in C^n.


We have the following conclusion. 


Proposition 2.12.4. If A : R^n → R^n is orthogonal, then R^n has an orthonormal
basis in which the matrix representation of A consists of blocks

(2.12.21) \begin{pmatrix} c_j & s_j \\ -s_j & c_j \end{pmatrix}, \quad c_j^2 + s_j^2 = 1,

plus perhaps an identity matrix block, if E(A, 1) ≠ 0, and a block that is -I, if
E(A, -1) ≠ 0.


EXAMPLE 1. Picking c, s ∈ R such that c^2 + s^2 = 1, we see that

B = \begin{pmatrix} c & s \\ s & -c \end{pmatrix}

is orthogonal, with det B = -1. Note that Spec(B) = {1, -1}. Thus there is an
orthonormal basis of R^2 in which the matrix representation of B is

\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.

EXAMPLE 2. If A : R^3 → R^3 is orthogonal, then there is an orthonormal basis
{u_1, u_2, u_3} of R^3 in which

(2.12.22) A = \begin{pmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & -1 \end{pmatrix},

depending on whether det A = 1 or det A = -1. (Note we have switched signs
on s, which is harmless. This lines our notation up with that used in §3.2.) Since
c^2 + s^2 = 1, it follows that there is an angle θ, uniquely determined up to an additive
multiple of 2π, such that

(2.12.23) c = \cos\theta, \quad s = \sin\theta.

(See §1.1, and also §3.2.) If det A = 1 in (2.12.22), we say A is a rotation about the
axis u_3, through an angle θ.

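
The two normal forms for a 3 x 3 orthogonal matrix described in Example 2 can be checked directly: each satisfies A^t A = I, and the determinant is +1 or -1 according to the last diagonal entry. The following is a minimal numerical sketch; the function names and the sample angle are illustrative assumptions.

```python
import math

def normal_form(theta, last_entry):
    """Block matrix with a 2 x 2 rotation block and last diagonal entry +1 or -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, float(last_entry)]]

def gram(A):
    """Compute A^t A for a square matrix given as a list of rows."""
    n = len(A)
    return [[sum(A[l][i] * A[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def det3(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

theta = 0.7                             # sample angle (hypothetical)
rotation = normal_form(theta, 1)        # det = 1: rotation about the third axis
reflected = normal_form(theta, -1)      # det = -1
```

The third basis vector is fixed by `rotation` and reversed by `reflected`, matching the eigenvalue +1 or -1 in each case.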
Exercises 


1. Let V be a real inner product space. Consider nonzero vectors u, v ∈ V. Show
that the angle θ between these vectors is uniquely defined by the formula

(u, v) = \|u\| \cdot \|v\| \cos\theta, \quad 0 \le \theta \le \pi.

Show that 0 < θ < π if and only if u and v are linearly independent. Show that

\|u + v\|^2 = \|u\|^2 + \|v\|^2 + 2\|u\| \cdot \|v\| \cos\theta.

This identity is known as the law of cosines.


For V as above, u, v, w ∈ V, one defines the angle between the line segment from
w to u and the line segment from w to v to be the angle between u - w and v - w.
(We assume w ≠ u and w ≠ v.)


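2. Take V = R^2, with its standard orthonormal basis i = (1,0), j = (0,1). Let

u = (1, 0), \quad v = (\cos\varphi, \sin\varphi), \quad 0 \le \varphi < 2\pi.

Show that, according to the definition of Exercise 1, the angle θ between u and v
is given by

\theta = \varphi, \quad \text{if } 0 \le \varphi \le \pi,
\theta = 2\pi - \varphi, \quad \text{if } \pi \le \varphi < 2\pi.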


3. Let V be a real inner product space and let R ∈ L(V) be orthogonal. Show that
if u, v ∈ V are nonzero and ũ = Ru, ṽ = Rv, then the angle between u and v is
equal to the angle between ũ and ṽ. Show that if {e_j} is an orthonormal basis of
V, there exists an orthogonal transformation R on V such that Ru = ||u|| e_1 and
Rv is in the linear span of e_1 and e_2.

Figure 2.12.1. Law of sines


4. Consider a triangle as in Figure 2.12.1. Show that
h = c \sin A,
and also
h = a \sin C.
Use these calculations to show that

\frac{\sin A}{a} = \frac{\sin C}{c} = \frac{\sin B}{b}.

This identity is known as the law of sines.


Exercises on cross products 


Exercises 5-8 deal with cross products of vectors in R^3.


5. If u, v ∈ R^3, show that the formula

(2.12.24) w \cdot (u \times v) = \det \begin{pmatrix} w_1 & u_1 & v_1 \\ w_2 & u_2 & v_2 \\ w_3 & u_3 & v_3 \end{pmatrix}

for u × v = Π(u, v) defines uniquely a bilinear map Π : R^3 × R^3 → R^3. Show that
it satisfies

i \times j = k, \quad j \times k = i, \quad k \times i = j,

where {i, j, k} is the standard basis of R^3.
Note. To say Π is bilinear is to say Π(u, v) is linear in both u and v.


6. Recall that T ∈ SO(3) provided that T is a real 3 × 3 matrix satisfying T^t T = I
and det T > 0 (hence det T = 1). Show that

(2.12.25) T \in SO(3) \Longrightarrow Tu \times Tv = T(u \times v).

Hint. Multiply the 3 × 3 matrix in Exercise 5 on the left by T.


7. Show that, if θ is the angle between u and v in R^3, then

(2.12.26) \|u \times v\| = \|u\| \cdot \|v\| \cdot |\sin\theta|.

Hint. Check (2.12.26) for u = i, v = ai + bj, and use Exercise 6 to show this
suffices.


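8. More generally, show that for all u, v, w, x ∈ R^3,

(2.12.27) (u \times v) \cdot (w \times x) = \det \begin{pmatrix} u \cdot w & v \cdot w \\ u \cdot x & v \cdot x \end{pmatrix}.

Hint. Using Exercise 6, show that it suffices to check this for

w = i, \quad x = ai + bj, \quad \text{so } w \times x = bk.

Then the left side of (2.12.27) is equal to

(u \times v) \cdot bk = \det \begin{pmatrix} 0 & u \cdot i & v \cdot i \\ 0 & u \cdot j & v \cdot j \\ b & u \cdot k & v \cdot k \end{pmatrix}
= b \det \begin{pmatrix} u \cdot i & v \cdot i \\ u \cdot j & v \cdot j \end{pmatrix}
= \det \begin{pmatrix} u \cdot i & v \cdot i \\ u \cdot (ai + bj) & v \cdot (ai + bj) \end{pmatrix},

which is equal to the right side of (2.12.27).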

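9. Show that κ : R^3 → Skew(3), the set of antisymmetric real 3 × 3 matrices, given
by

(2.12.28) \kappa(y) = \begin{pmatrix} 0 & -y_3 & y_2 \\ y_3 & 0 & -y_1 \\ -y_2 & y_1 & 0 \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix},

satisfies

(2.12.29) \kappa(y)x = y \times x.

Show that, with [A, B] = AB - BA,

(2.12.30) \kappa(x \times y) = [\kappa(x), \kappa(y)],
          \operatorname{Tr}(\kappa(x)\kappa(y)^t) = 2\, x \cdot y.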


10. Demonstrate the following result, which contains both Propositions 2.11.2 and
2.12.2. Let V be a finite dimensional inner product space. We say T : V → V is
normal provided T and T* commute, i.e.,

(2.12.31) TT^* = T^*T.

Proposition. If V is a finite dimensional complex inner product space and T ∈
L(V) is normal, then V has an orthonormal basis of eigenvectors of T.

Hint. Write T = A + iB, A and B self-adjoint. Then (2.12.31) ⟹ AB = BA.
Apply Exercise 3 of §2.11.


2.A. The Jordan canonical form 


Let V be an n-dimensional complex vector space, and suppose T : V → V. The
following result gives the Jordan canonical form for T.


Proposition 2.A.1. There is a basis of V with respect to which T is represented
as a direct sum of blocks of the form

(2.A.1) \begin{pmatrix} \lambda_j & 1 & & \\ & \lambda_j & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_j \end{pmatrix}.


In light of Proposition 2.7.6 on generalized eigenspaces, together with Proposition
2.8.1 characterizing nilpotent operators and the discussion around (2.8.14), to
prove Proposition 2.A.1 it suffices to establish such a Jordan canonical form for a
nilpotent transformation N : V → V. (Then λ_j = 0.) We turn to this task.

Given v_0 ∈ V, let m be the smallest integer such that N^m v_0 = 0; m ≤ n.
If m = n, then {v_0, Nv_0, ..., N^{m-1}v_0} gives a basis of V putting N in Jordan
canonical form, with one block of the form (2.A.1) (with λ_j = 0). In any case, we
call {v_0, ..., N^{m-1}v_0} a string. To obtain a Jordan canonical form for N, it will
suffice to find a basis of V consisting of a family of strings. We will establish that
this can be done by induction on dim V. It is clear for dim V ≤ 1.

So, given a nilpotent N : V → V, we can assume inductively that V_1 = N(V)
has a basis that is a union of strings:

(2.A.2) \{ v_j, Nv_j, \dots, N^{\ell_j} v_j \}, \quad 1 \le j \le d.

Furthermore, each v_j has the form v_j = Nw_j for some w_j ∈ V. Hence we have the
following strings in V:

(2.A.3) \{ w_j, v_j = Nw_j, Nv_j, \dots, N^{\ell_j} v_j \}, \quad 1 \le j \le d.

Note that the vectors in (2.A.3) are linearly independent. To see this, apply N to
a linear combination and invoke the independence of the vectors in (2.A.2).


Now, pick a set {ζ_1, ..., ζ_ν} ⊂ V which, together with the vectors in (2.A.3),
forms a basis of V. Then each Nζ_j can be written Nζ_j = Nξ_j for some ξ_j in the
linear span of the vectors in (2.A.3), so

(2.A.4) z_1 = \zeta_1 - \xi_1, \ \dots, \ z_\nu = \zeta_\nu - \xi_\nu

also together with (2.A.3) forms a basis of V, and furthermore z_j ∈ \mathcal{N}(N). Hence
the strings

(2.A.5) \{ w_j, v_j, \dots, N^{\ell_j} v_j \}, \quad 1 \le j \le d, \quad \{z_1\}, \dots, \{z_\nu\},

provide a basis of V, giving N its Jordan canonical form.

There is some choice in producing bases putting T ∈ L(V) in block form. So
we ask, in what sense is the Jordan form canonical? The answer is that the sizes
of the various blocks are independent of the choices made. To show this, again it
suffices to consider the case of a nilpotent N : V → V. Let β(k) denote the number
of blocks of size k × k in a Jordan decomposition of N, and let β = \sum_k β(k) denote
the total number of blocks. Note that dim \mathcal{N}(N) = β. Also dim \mathcal{N}(N^2) exceeds
dim \mathcal{N}(N) by β - β(1). In fact, generally,

(2.A.6) \dim \mathcal{N}(N) = \beta,
        \dim \mathcal{N}(N^2) = \dim \mathcal{N}(N) + \beta - \beta(1),
        \vdots
        \dim \mathcal{N}(N^{k+1}) = \dim \mathcal{N}(N^k) + \beta - \beta(1) - \cdots - \beta(k).

These identities specify β and then inductively each β(k) in terms of dim \mathcal{N}(N^j), 1 ≤
j ≤ k + 1.
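
The inductive recovery of the block counts from the identities (2.A.6) can be sketched in a few lines of Python. The helper names below are hypothetical. The first function produces the null space dimensions for a nilpotent matrix with prescribed block sizes, using the observation that a single nilpotent block of size s contributes min(j, s) to the dimension of the null space of N^j; the second inverts (2.A.6).

```python
def kernel_dims(block_sizes, n_powers):
    """dim of the null space of N^j for j = 1, ..., n_powers.

    Assumes N is a direct sum of nilpotent Jordan blocks with the given
    sizes; a block of size s contributes min(j, s) to the null space of N^j.
    """
    return [sum(min(j, s) for s in block_sizes) for j in range(1, n_powers + 1)]

def block_counts(dims):
    """Recover beta(k) from dims[j - 1] = dim null space of N^j, via (2.A.6)."""
    beta = dims[0]                         # total number of blocks
    counts = {}
    used = 0                               # beta(1) + ... + beta(k - 1)
    for k in range(1, len(dims)):
        growth = dims[k] - dims[k - 1]     # equals beta - beta(1) - ... - beta(k)
        counts[k] = beta - used - growth
        used += counts[k]
    return counts

sizes = [3, 1, 1]                          # sample block sizes (hypothetical)
dims = kernel_dims(sizes, 4)               # dimensions for N, N^2, N^3, N^4
counts = block_counts(dims)                # {k: number of k x k blocks}
```

For the sample sizes, the computed counts report two blocks of size 1, none of size 2, and one of size 3, which is the input data recovered from the dimensions alone.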


2.B. Schur’s upper triangular representation 


Let V be an n-dimensional complex vector space, equipped with an inner product,
and let T ∈ L(V). The following is an important alternative to Proposition 2.A.1.


Proposition 2.B.1. There is an orthonormal basis of V with respect to which T 
has an upper triangular form. 


Note that an upper triangular form with respect to some basis was achieved in 
(2.8.15), but there the basis was not guaranteed to be orthonormal. We will obtain 
Proposition 2.B.1 as a consequence of the following. 


Proposition 2.B.2. There is a sequence of vector spaces V_j of dimension j such
that

(2.B.1) V = V_n \supset V_{n-1} \supset \cdots \supset V_1

and

(2.B.2) T : V_j \to V_j.


We show how Proposition 2.B.2 implies Proposition 2.B.1. In fact, given
(2.B.1)-(2.B.2), pick u_n ⊥ V_{n-1}, a unit vector, then pick a unit u_{n-1} ∈ V_{n-1}
such that u_{n-1} ⊥ V_{n-2}, and so forth, to achieve the conclusion of Proposition
2.B.1.


Meanwhile, Proposition 2.B.2 is a simple inductive consequence of the following 
result. 


Lemma 2.B.3. Given T ∈ L(V) as above, there is a linear subspace V_{n-1}, of
dimension n - 1, such that T : V_{n-1} → V_{n-1}.


Proof. We apply Proposition 2.6.1 to T* to obtain a nonzero v_1 ∈ V such that
T*v_1 = λv_1, for some λ ∈ C. Then the conclusion of Lemma 2.B.3 holds with
V_{n-1} = (v_1)^⊥. Indeed, if (w, v_1) = 0, then (Tw, v_1) = (w, T*v_1) = \overline{λ}(w, v_1) = 0.


2.C. The fundamental theorem of algebra


The following result is known as the fundamental theorem of algebra. It played a 
crucial role in §2.6, to guarantee the existence of eigenvalues of a complex n x n 
matrix. 


Theorem 2.C.1. If p(z) is a nonconstant polynomial (with complex coefficients), 
then p(z) must have a complex root. 


Proof. We have, for some n ≥ 1, a_n ≠ 0,

(2.C.1) p(z) = a_n z^n + \cdots + a_1 z + a_0
             = a_n z^n (1 + R(z)), \quad |z| \to \infty,

where

|R(z)| \le \frac{C}{|z|}, \quad \text{for } |z| \text{ large}.

This implies

(2.C.2) \lim_{|z| \to \infty} |p(z)| = \infty.

Picking R ∈ (0, ∞) such that

(2.C.3) \inf_{|z| \ge R} |p(z)| > |p(0)|,

we deduce that

(2.C.4) \inf_{|z| \le R} |p(z)| = \inf_{z \in \mathbb{C}} |p(z)|.

Since D_R = {z : |z| ≤ R} is closed and bounded and p is continuous, there exists
z_0 ∈ D_R such that

(2.C.5) |p(z_0)| = \inf_{z \in \mathbb{C}} |p(z)|.


(For further discussion of this point, see Appendix 4.B.) The following lemma thus 
completes the proof. 


Lemma 2.C.2. If p(z) is a nonconstant polynomial and (2.C.5) holds, then p(z_0) =
0.


Proof. Suppose to the contrary that

(2.C.6) p(z_0) = a \ne 0.

We can write

(2.C.7) p(z_0 + \zeta) = a + q(\zeta),

where q(ζ) is a (nonconstant) polynomial in ζ, satisfying q(0) = 0. Hence, for some
k ≥ 1 and b ≠ 0, we have q(ζ) = bζ^k + ⋯ + b_n ζ^n, i.e.,

(2.C.8) q(\zeta) = b\zeta^k + \zeta^{k+1} r(\zeta), \quad |r(\zeta)| \le C, \quad \zeta \to 0,

so, with ζ = εω, ω ∈ S^1 = {ω : |ω| = 1},

(2.C.9) p(z_0 + \varepsilon\omega) = a + b\omega^k \varepsilon^k + (\varepsilon\omega)^{k+1} r(\varepsilon\omega), \quad \varepsilon \searrow 0.

Pick ω ∈ S^1 such that

(2.C.10) \frac{b}{|b|}\,\omega^k = -\frac{a}{|a|},

which is possible since a ≠ 0 and b ≠ 0. Then

(2.C.11) p(z_0 + \varepsilon\omega) = a\Bigl(1 - \Bigl|\frac{b}{a}\Bigr| \varepsilon^k\Bigr) + (\varepsilon\omega)^{k+1} r(\varepsilon\omega),

with r(ζ) as in (2.C.8), which contradicts (2.C.5) for ε > 0 small enough. Thus
(2.C.6) is impossible. This proves Lemma 2.C.2, hence Theorem 2.C.1.


Now that we have shown that p(z) in (2.C.1) must have one root, we can show 
it has n roots (counting multiplicity). 


Proposition 2.C.3. For a polynomial p(z) of degree n, as in (2.C.1), there exist
r_1, ..., r_n ∈ C such that

(2.C.12) p(z) = a_n (z - r_1) \cdots (z - r_n).


Proof. We have shown that p(z) has one root; call it r_1. Dividing p(z) by z - r_1,
we have

(2.C.13) p(z) = (z - r_1)\,\tilde{p}(z) + q,

where \tilde{p}(z) = a_n z^{n-1} + ⋯ + \tilde{a}_0 and q is a polynomial of degree < 1, i.e., a constant.
Setting z = r_1 in (2.C.13) yields q = 0, i.e.,

(2.C.14) p(z) = (z - r_1)\,\tilde{p}(z).

Since \tilde{p}(z) is a polynomial of degree n - 1, the result (2.C.12) follows by induction
on n.
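
The division step (2.C.13) can be carried out by synthetic division (Horner's scheme). The following is a minimal sketch, not part of the proof; the function name `deflate` and the sample polynomial z^2 + 1 are illustrative assumptions. It returns the coefficients of the quotient and the constant remainder q, which vanishes exactly when the chosen number is a root.

```python
def deflate(coeffs, root):
    """Divide p(z) by (z - root) using synthetic division.

    coeffs lists a_n, ..., a_0 (highest degree first). Returns the
    coefficients of the quotient, highest degree first, and the
    constant remainder q of (2.C.13), which equals p(root).
    """
    values = []
    carry = 0
    for a in coeffs:
        carry = carry * root + a     # Horner step
        values.append(carry)
    remainder = values.pop()         # last Horner value is p(root)
    return values, remainder

p = [1, 0, 1]                        # p(z) = z^2 + 1 (sample polynomial)
p_tilde, q = deflate(p, 1j)          # divide by z - i
p_last, q2 = deflate(p_tilde, -1j)   # divide the quotient by z + i
```

After removing both roots, only the leading coefficient a_n = 1 remains, in line with the factorization (2.C.12).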


REMARK 1. The numbers r_j, 1 ≤ j ≤ n, in (2.C.12) are the roots of p(z). If k of
them coincide (say with r_ℓ), we say r_ℓ is a root of multiplicity k. If r_ℓ is distinct
from r_j for all j ≠ ℓ, we say r_ℓ is a simple root.


REMARK 2. In complex analysis texts, like [4] and [47], one can find proofs of the 
fundamental theorem of algebra that are even shorter than the one given above, 
but that use more advanced techniques. 


Chapter 3 


Linear systems of differential 
equations 


This chapter connects the linear algebra developed in Chapter 2 with differential
equations. We define the matrix exponential in §3.1 and show how it produces the
solution to first order systems of differential equations with constant coefficients.
We show how the use of eigenvectors and generalized eigenvectors helps to compute
matrix exponentials. In §3.2 we look again at connections between exponential and
trigonometric functions, complementing results of Chapter 1, §1.1.


In §3.3 we discuss how to reduce a higher order differential equation to a first 
order system, and show how the “companion matrix” of a polynomial arises in 
doing this. We show in §3.4 how the matrix exponential allows us to write down 
an integral formula (Duhamel’s formula) for the solution to a non-homogeneous 
first order system, and illustrate how this in concert with the reduction process 
just mentioned, allows us to write down the solution to a non-homogeneous second 
order differential equation. 


Section 3.5 discusses how to derive first order systems describing the behavior 
of simple circuits, consisting of resistors, inductors, and capacitors. Here we treat 
a more general class of circuits than done in Chapter 1, §1.13. 


Section 3.6 deals with second order systems. While it is the case that second 
order n x n systems can always be converted into first order (2n) x (2n) systems, 
many such systems have special structure, worthy of separate study. Material on 
self adjoint transformations from Chapter 2 plays an important role in this section. 


In §3.7 we discuss the Frenet-Serret equations, for a curve in three-dimensional
Euclidean space. These equations involve the curvature and torsion of a curve,
and also a frame field along the curve, called the Frenet frame, which forms an
orthonormal basis of R^3 at each point on the curve. Regarding these equations as
a system of differential equations, we discuss the problem of finding a curve with
given curvature and torsion. Doing this brings in a number of topics from the
previous sections, and from Chapter 2, such as the use of properties of orthogonal
matrices.


Having introduced equations with variable coefficients in §3.7, we concentrate 
on their treatment in subsequent sections. In §3.8 we study the solution operator 
S(t,s) to a homogeneous system, show how it extends the notion of matrix expo- 
nential, and extend Duhamel’s formula to the variable coefficient setting. In §3.9 
we show how the method of variation of parameters, introduced in Chapter 1, ties 
in with and becomes a special case of Duhamel’s formula. 


Section 3.10 treats power series expansions for a first order linear system with 
analytic coefficients, and §3.11 extends the study to equations with regular singular 
points. These sections provide a systematic treatment of material touched on in 
Chapter 1, §1.15. In these sections we use elementary power series techniques. 
Additional insight can be gleaned from the theory of functions of a complex variable. 
Readers who have seen some complex variable theory can consult [4], pp. 299-312, 
[19], pp. 70-83, or [47], Chapter 7, for material on this. 


Appendix 3.A treats logarithms of matrices, a construction inverse to the matrix
exponential introduced in §3.1, establishing results that are of use in §§3.8
and 3.11. Building on material from §1.18 of Chapter 1, Appendix 3.B develops
the Laplace transform in the matrix setting, as a tool for solving nonhomogeneous
linear systems. It also draws a connection between this method and Duhamel's
formula. Appendix 3.C provides a brief introduction to the class of complex analytic
functions, whose relevance for power series techniques in ODE was touched on in
§3.10.


3.1. The matrix exponential 
Here we discuss a key concept in matrix analysis, the matrix exponential. Given
A ∈ M(n,F), F = R or C, we define e^A by the same power series used in Chapter
1 to define e^A for A ∈ R:

(3.1.1) e^A = \sum_{k=0}^{\infty} \frac{1}{k!} A^k.

Note that A can be a real or complex n × n matrix. In either case, recall from §2.10
of Chapter 2 that ||A^k|| ≤ ||A||^k. Hence the standard ratio test implies (3.1.1) is
absolutely convergent for each A ∈ M(n,F). Hence

(3.1.2) e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k


is a convergent power series in t, for all t ∈ R (indeed for t ∈ C). As for all such
convergent power series, we can differentiate term by term. We have

(3.1.3) \frac{d}{dt} e^{tA} = \sum_{k=1}^{\infty} \frac{t^{k-1}}{(k-1)!} A^k = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^{k+1}.

We can factor A out on either the left or the right, obtaining

(3.1.4) \frac{d}{dt} e^{tA} = e^{tA}A = Ae^{tA}.

Hence x(t) = e^{tA}x_0 solves the first-order system

(3.1.5) \frac{dx}{dt} = Ax, \quad x(0) = x_0.

This is the unique solution to (3.1.5). To see this, let x(t) be any solution to (3.1.5),
and consider

(3.1.6) u(t) = e^{-tA}x(t).

Then u(0) = x(0) = x_0 and

(3.1.7) \frac{d}{dt} u(t) = -e^{-tA}Ax(t) + e^{-tA}x'(t) = 0,

so u(t) ≡ u(0) = x_0. The same argument yields

(3.1.8) \frac{d}{dt} \bigl( e^{tA}e^{-tA} \bigr) = 0, \quad \text{hence } e^{tA}e^{-tA} \equiv I.

Hence x(t) = e^{tA}x_0, as asserted.
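
The power series definition and the two facts just derived can be tested numerically. The sketch below sums finitely many terms of the series for e^{tA}, then checks by a central difference quotient that t ↦ e^{tA}x_0 satisfies (3.1.5), and that e^{tA}e^{-tA} is close to I as in (3.1.8). The sample matrix, the initial vector, the number of terms, and the helper names are illustrative assumptions.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=40):
    """Partial sum of the exponential series: sum over k < terms of (t^k / k!) A^k."""
    n = len(A)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        # term becomes (t^k / k!) A^k, built from the previous term
        term = [[t * x / k for x in row] for row in mat_mul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

A = [[0.0, 1.0], [-1.0, 0.0]]       # sample matrix (hypothetical)
x0 = [1.0, 2.0]                     # sample initial data (hypothetical)
t, h = 0.8, 1e-5

x_plus = apply(mat_exp(A, t + h), x0)
x_minus = apply(mat_exp(A, t - h), x0)
derivative = [(p - m) / (2 * h) for p, m in zip(x_plus, x_minus)]  # approximates x'(t)
rhs = apply(A, apply(mat_exp(A, t), x0))                           # A x(t)
product = mat_mul(mat_exp(A, t), mat_exp(A, -t))                   # should be near I
```

The difference quotient only approximates x'(t), so agreement with Ax(t) holds up to a small discretization error rather than exactly.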
Using a variant of the computation (3.1.7), we show that the matrix exponential
has the following property, which generalizes the identity e^{s+t} = e^s e^t for real s, t,
established in Chapter 1.


Proposition 3.1.1. Given A ∈ M(n,C), s, t ∈ R,

(3.1.9) e^{(s+t)A} = e^{sA}e^{tA}.


Proof. Using the Leibniz formula for the derivative of a product, plus (3.1.4), we
have

(3.1.10) \frac{d}{dt} \bigl( e^{(s+t)A}e^{-tA} \bigr) = e^{(s+t)A}Ae^{-tA} - e^{(s+t)A}Ae^{-tA} = 0.

Hence e^{(s+t)A}e^{-tA} is independent of t, so

(3.1.11) e^{(s+t)A}e^{-tA} = e^{sA}, \quad \forall\, s, t \in \mathbb{R}.

Taking s = 0 yields e^{tA}e^{-tA} = I (as we have already seen in (3.1.8)) or e^{-tA} =
(e^{tA})^{-1}, so we can multiply both sides of (3.1.11) on the right by e^{tA} and obtain
(3.1.9).


Now, generally, for A, B ∈ M(n,F),

(3.1.12) e^A e^B \ne e^{A+B}.

However, we do have the following.

Proposition 3.1.2. Given A, B ∈ M(n,C),

(3.1.13) AB = BA \Longrightarrow e^{A+B} = e^A e^B.

Proof. We compute

(3.1.14) \frac{d}{dt} \bigl( e^{t(A+B)}e^{-tB}e^{-tA} \bigr)
         = e^{t(A+B)}(A+B)e^{-tB}e^{-tA} - e^{t(A+B)}Be^{-tB}e^{-tA} - e^{t(A+B)}e^{-tB}Ae^{-tA}.

Now AB = BA ⟹ AB^k = B^k A, hence

(3.1.15) e^{-tB}A = \sum_{k=0}^{\infty} \frac{(-t)^k}{k!} B^k A = Ae^{-tB},

so (3.1.14) vanishes. Hence e^{t(A+B)}e^{-tB}e^{-tA} is independent of t, so

(3.1.16) e^{t(A+B)}e^{-tB}e^{-tA} = I,

the value at t = 0. Multiplying through on the right by e^{tA} and then by e^{tB} gives

(3.1.17) e^{t(A+B)} = e^{tA}e^{tB}.

Setting t = 1 gives (3.1.13).


We now look at examples of matrix exponentials. We start with some compu- 
tations via the infinite series (3.1.2). Take 


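(3.1.18) A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.

Then

(3.1.19) A^k = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B^2 = B^3 = \cdots = 0,

so

(3.1.20) e^{tA} = \begin{pmatrix} e^t & 0 \\ 0 & 1 \end{pmatrix}, \quad e^{tB} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}.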

Note that A and B do not commute, and neither do e^{tA} and e^{tB}, for general t ≠ 0.
On the other hand, if we take

(3.1.21) C = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = I + B,

since I and B commute, we have without further effort that

(3.1.22) e^{tC} = e^{tI}e^{tB} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix}.

We turn to constructions of matrix exponentials via use of eigenvalues and
eigenvectors. Suppose v_j is an eigenvector of A with eigenvalue λ_j,

(3.1.23) Av_j = \lambda_j v_j.

Then A^k v_j = λ_j^k v_j, and hence

(3.1.24) e^{tA}v_j = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k v_j = \sum_{k=0}^{\infty} \frac{t^k}{k!} \lambda_j^k v_j = e^{t\lambda_j} v_j.

This enables us to construct e^{tA}v for each v ∈ C^n if A ∈ M(n,C) and C^n has a basis
of eigenvectors, {v_j : 1 ≤ j ≤ n}. In such a case, write v as a linear combination of
the eigenvectors,

(3.1.25) v = c_1 v_1 + \cdots + c_n v_n,

and then

(3.1.26) e^{tA}v = c_1 e^{tA}v_1 + \cdots + c_n e^{tA}v_n
                 = c_1 e^{t\lambda_1} v_1 + \cdots + c_n e^{t\lambda_n} v_n.

We illustrate this process with some examples. 


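EXAMPLE 1. Take

(3.1.27) A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.

One has det(λI - A) = λ^2 - 1, hence eigenvalues

(3.1.28) \lambda_1 = 1, \quad \lambda_2 = -1,

with corresponding eigenvectors

(3.1.29) v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.

Hence

(3.1.30) e^{tA}v_1 = e^t v_1, \quad e^{tA}v_2 = e^{-t} v_2.

To write out e^{tA} as a 2 × 2 matrix, note that the first and second columns of this
matrix are given respectively by

(3.1.31) e^{tA} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad e^{tA} \begin{pmatrix} 0 \\ 1 \end{pmatrix}.

To compute this, we write (1,0)^t and (0,1)^t as linear combinations of the
eigenvectors. We have

(3.1.32) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.

Hence

(3.1.33) e^{tA} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{e^t}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{e^{-t}}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} e^t + e^{-t} \\ e^t - e^{-t} \end{pmatrix},

and similarly

(3.1.34) e^{tA} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} e^t - e^{-t} \\ e^t + e^{-t} \end{pmatrix}.

Recalling that

(3.1.35) \cosh t = \frac{e^t + e^{-t}}{2}, \quad \sinh t = \frac{e^t - e^{-t}}{2},

we have

(3.1.36) e^{tA} = \begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}.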
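EXAMPLE 2. Take

(3.1.37) A = \begin{pmatrix} 0 & -2 \\ 1 & 2 \end{pmatrix}.

One has det(λI - A) = λ^2 - 2λ + 2, hence eigenvalues

(3.1.38) \lambda_1 = 1 + i, \quad \lambda_2 = 1 - i,

with corresponding eigenvectors

(3.1.39) v_1 = \begin{pmatrix} -2 \\ 1 + i \end{pmatrix}, \quad v_2 = \begin{pmatrix} -2 \\ 1 - i \end{pmatrix}.

We have

(3.1.40) e^{tA}v_1 = e^{(1+i)t} v_1, \quad e^{tA}v_2 = e^{(1-i)t} v_2.

We can write

(3.1.41) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = -\frac{1+i}{4} \begin{pmatrix} -2 \\ 1+i \end{pmatrix} - \frac{1-i}{4} \begin{pmatrix} -2 \\ 1-i \end{pmatrix},
         \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\frac{i}{2} \begin{pmatrix} -2 \\ 1+i \end{pmatrix} + \frac{i}{2} \begin{pmatrix} -2 \\ 1-i \end{pmatrix},

to obtain

(3.1.42) e^{tA} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = -\frac{1+i}{4} e^{(1+i)t} \begin{pmatrix} -2 \\ 1+i \end{pmatrix} - \frac{1-i}{4} e^{(1-i)t} \begin{pmatrix} -2 \\ 1-i \end{pmatrix}
         = \frac{e^t}{4} \begin{pmatrix} (2+2i)e^{it} + (2-2i)e^{-it} \\ -2ie^{it} + 2ie^{-it} \end{pmatrix},

and

(3.1.43) e^{tA} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\frac{i}{2} e^{(1+i)t} \begin{pmatrix} -2 \\ 1+i \end{pmatrix} + \frac{i}{2} e^{(1-i)t} \begin{pmatrix} -2 \\ 1-i \end{pmatrix}
         = \frac{e^t}{2} \begin{pmatrix} 2ie^{it} - 2ie^{-it} \\ (1-i)e^{it} + (1+i)e^{-it} \end{pmatrix}.

We can write these in terms of trigonometric functions, using the fundamental Euler
identities

(3.1.44) e^{it} = \cos t + i \sin t, \quad e^{-it} = \cos t - i \sin t,

established in §1.1 of Chapter 1. (See §3.2 of this chapter for more on this.) These
yield

(3.1.45) \cos t = \frac{e^{it} + e^{-it}}{2}, \quad \sin t = \frac{e^{it} - e^{-it}}{2i},

and an inspection of the formulas above gives

(3.1.46) e^{tA} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^t \begin{pmatrix} \cos t - \sin t \\ \sin t \end{pmatrix}, \quad e^{tA} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = e^t \begin{pmatrix} -2 \sin t \\ \cos t + \sin t \end{pmatrix},

hence

(3.1.47) e^{tA} = e^t \begin{pmatrix} \cos t - \sin t & -2 \sin t \\ \sin t & \cos t + \sin t \end{pmatrix}.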


As was shown in Chapter 2, §2.6, if $A \in M(n,\mathbb{C})$ has $n$ distinct eigenvalues, then $\mathbb{C}^n$ has a basis of eigenvectors. If $A$ has multiple eigenvalues, $\mathbb{C}^n$ might or might not have a basis of eigenvectors, though as shown in §2.7 of Chapter 2, there will be a basis of generalized eigenvectors. If $v$ is a generalized eigenvector of $A$, say

(3.1.48)    $(A - \lambda I)^m v = 0,$

then

(3.1.49)    $e^{t(A - \lambda I)}v = \sum_{k<m} \frac{t^k}{k!}(A - \lambda I)^k v,$

so

(3.1.50)    $e^{tA}v = e^{t\lambda}\sum_{k<m} \frac{t^k}{k!}(A - \lambda I)^k v.$


EXAMPLE 3. Consider the 3 x 3 matrix A used in (2.7.28) of Chapter 2: 


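(3.1.51)    $A = \begin{pmatrix} 2 & 3 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{pmatrix}.$

Here 2 is a double eigenvalue and 1 a simple eigenvalue. Calculations done in Chapter 2, §2.7, yield

(3.1.52)    $(A - 2I)\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = 0, \quad (A - 2I)\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 3\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad (A - I)\begin{pmatrix} 6 \\ -3 \\ 1 \end{pmatrix} = 0.$

Hence

(3.1.53)    $e^{tA}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} e^{2t} \\ 0 \\ 0 \end{pmatrix},$

(3.1.54)    $e^{tA}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = e^{2t}\sum_{k=0}^{1} \frac{t^k}{k!}(A - 2I)^k\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$

(3.1.55)    $\qquad = e^{2t}\left[\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 3t \\ 0 \\ 0 \end{pmatrix}\right],$

and

(3.1.56)    $e^{tA}\begin{pmatrix} 6 \\ -3 \\ 1 \end{pmatrix} = e^{t}\begin{pmatrix} 6 \\ -3 \\ 1 \end{pmatrix}.$

Note that

(3.1.57)    $e^{tA}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = e^{tA}\begin{pmatrix} 6 \\ -3 \\ 1 \end{pmatrix} - 6e^{tA}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + 3e^{tA}\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}.$

Putting these calculations together yields

(3.1.58)    $e^{tA} = \begin{pmatrix} e^{2t} & 3te^{2t} & 6e^{t} - 6e^{2t} + 9te^{2t} \\ 0 & e^{2t} & -3e^{t} + 3e^{2t} \\ 0 & 0 & e^{t} \end{pmatrix}.$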

EXAMPLE 4. Consider the 3 x 3 matrix 


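(3.1.59)    $A = \begin{pmatrix} 1 & 2 & 0 \\ 3 & 1 & 3 \\ 0 & -2 & 1 \end{pmatrix}.$

A computation gives $\det(\lambda I - A) = (\lambda - 1)^3$. Hence for $N = A - I$ we have $\operatorname{Spec}(N) = \{0\}$, so we know $N$ is nilpotent (by Proposition 2.8.1 of Chapter 2). In fact, a calculation gives

(3.1.60)    $N = \begin{pmatrix} 0 & 2 & 0 \\ 3 & 0 & 3 \\ 0 & -2 & 0 \end{pmatrix}, \quad N^2 = \begin{pmatrix} 6 & 0 & 6 \\ 0 & 0 & 0 \\ -6 & 0 & -6 \end{pmatrix}, \quad N^3 = 0.$

Hence

(3.1.61)    $e^{tA} = e^{t}\left[I + tN + \frac{t^2}{2}N^2\right] = e^{t}\begin{pmatrix} 1 + 3t^2 & 2t & 3t^2 \\ 3t & 1 & 3t \\ -3t^2 & -2t & 1 - 3t^2 \end{pmatrix}.$
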
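
The nilpotent computation in Example 4 can be checked with exact rational arithmetic. This is a small sketch, not part of the text: the names `polynomial_factor` and `stated_factor` are made up for the illustration, and both functions leave out the scalar factor $e^t$.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [3, 1, 3], [0, -2, 1]]           # matrix of Example 4
I3 = [[int(i == j) for j in range(3)] for i in range(3)]
N = [[A[i][j] - I3[i][j] for j in range(3)] for i in range(3)]   # N = A - I
N2 = mat_mul(N, N)
N3 = mat_mul(N2, N)

def polynomial_factor(t):
    """I + tN + (t^2/2) N^2, i.e. e^{tA} with the scalar factor e^t removed."""
    return [[I3[i][j] + t * N[i][j] + t * t / 2 * N2[i][j] for j in range(3)]
            for i in range(3)]

def stated_factor(t):
    """The matrix displayed in (3.1.61), again without the factor e^t."""
    return [[1 + 3 * t * t, 2 * t, 3 * t * t],
            [3 * t, 1, 3 * t],
            [-3 * t * t, -2 * t, 1 - 3 * t * t]]

t = Fraction(1, 2)                                # exact sample value of t
print(N3)
print(polynomial_factor(t) == stated_factor(t))
```

Because $N^3 = 0$, the series for $e^{tN}$ stops after three terms, which is why an exact comparison is possible here.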
Exercises 


1. Use the method of eigenvalues and eigenvectors given in (3.1.23)–(3.1.26) to compute $e^{tA}$ for each of the following:


a=(0 3). 4-G 4). 4-G )4-G 9). 


2. Use the method given in (3.1.48)–(3.1.50) and illustrated in (3.1.51)–(3.1.61) to compute $e^{tA}$ for each of the following:


i Si 124 
A=(q q}), 4=(01 3), 4=(0 2 0 
001 
3. Show that

$e^{t(A+B)} = e^{tA}e^{tB}, \ \forall\, t \implies AB = BA.$

Hint. Set $X(t) = e^{t(A+B)}$, $Y(t) = e^{tA}e^{tB}$. Show that

$X \equiv Y \implies X'(t) - Y'(t) = Be^{t(A+B)} - e^{tA}Be^{tB} \equiv 0,$

and hence that

$X \equiv Y \implies Be^{tA} = e^{tA}B, \ \forall\, t.$


4. Given $A \in M(n,\mathbb{C})$, suppose $\Phi(t)$ is an $n \times n$ matrix valued solution to

$\frac{d}{dt}\Phi(t) = A\Phi(t).$

Show that

$\Phi(t) = e^{tA}B,$

where $B = \Phi(0)$. Deduce that $\Phi(t)$ is invertible for all $t \in \mathbb{R}$ if and only if $\Phi(0)$ is invertible, and that in such a case

$e^{(t-s)A} = \Phi(t)\Phi(s)^{-1}.$


(For a generalization, see (3.8.13).) 


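5. Let $A, B \in M(n,\mathbb{C})$ and assume $B$ is invertible. Show that

$(B^{-1}AB)^k = B^{-1}A^kB,$

and use this to show that

$e^{tB^{-1}AB} = B^{-1}e^{tA}B.$

6. Show that if $A$ is diagonal, i.e.,

$A = \begin{pmatrix} a_{11} & & \\ & \ddots & \\ & & a_{nn} \end{pmatrix},$

then

$e^{tA} = \begin{pmatrix} e^{ta_{11}} & & \\ & \ddots & \\ & & e^{ta_{nn}} \end{pmatrix}.$
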


Exercises 7–10 bear on the identity

(3.1.62)    $\det e^{tA} = e^{t \operatorname{Tr} A},$

given $A \in M(n,\mathbb{C})$.

7. Show that if (3.1.62) holds for $A = A_1$ and if $A_2 = B^{-1}A_1B$, then (3.1.62) holds for $A = A_2$.


8. Show that (3.1.62) holds whenever A is diagonalizable. 
Hint. Use Exercises 5-6. 
9. Assume $A \in M(n,\mathbb{C})$ is upper triangular:

(3.1.63)    $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ & \ddots & \vdots \\ & & a_{nn} \end{pmatrix}.$

Show that $e^{tA}$ is upper triangular, of the form

$e^{tA} = \begin{pmatrix} e_{11}(t) & \cdots & e_{1n}(t) \\ & \ddots & \vdots \\ & & e_{nn}(t) \end{pmatrix}, \quad e_{jj}(t) = e^{ta_{jj}}.$

10. Deduce that (3.1.62) holds when $A$ has the form (3.1.63). Then deduce that (3.1.62) holds for all $A \in M(n,\mathbb{C})$.


11. Let $A(t)$ be a smooth function of $t$ with values in $M(n,\mathbb{C})$. Show that

(3.1.64)    $A(0) = 0 \implies \frac{d}{dt}e^{A(t)}\Big|_{t=0} = A'(0).$

Hint. Take the power series expansion of $e^{A(t)}$, in powers of $A(t)$.

12. Let $A(t)$ be a smooth $M(n,\mathbb{C})$-valued function of $t \in I$ and assume

(3.1.65)    $A(s)A(t) = A(t)A(s), \quad \forall\, s, t \in I.$

Show that

(3.1.66)    $\frac{d}{dt}e^{A(t)} = A'(t)e^{A(t)} = e^{A(t)}A'(t).$

Hint. Show that if (3.1.65) holds,

$\frac{d}{dt}e^{A(t)} = \frac{d}{ds}e^{A(s) - A(t)}e^{A(t)}\Big|_{s=t},$

and apply Exercise 11.


13. Here is an alternative approach to Proposition 3.1.2. Assume

(3.1.67)    $A, B \in M(n,\mathbb{C}), \quad AB = BA.$

Show that

(3.1.68)    $(A + B)^m = \sum_{j=0}^{m} \binom{m}{j} A^j B^{m-j}, \quad \binom{m}{j} = \frac{m!}{j!(m-j)!}.$

From here, show that

(3.1.69)    $e^{A+B} = \sum_{m=0}^{\infty} \frac{1}{m!}(A + B)^m = \sum_{m=0}^{\infty}\sum_{j=0}^{m} \frac{1}{j!(m-j)!} A^j B^{m-j}.$

Then take $n = m - j$ and show this is

(3.1.70)    $= \sum_{j=0}^{\infty}\sum_{n=0}^{\infty} \frac{1}{j!\,n!} A^j B^n = \sum_{j=0}^{\infty} \frac{1}{j!}A^j \sum_{n=0}^{\infty} \frac{1}{n!}B^n = e^A e^B,$

so

(3.1.71)    $e^{A+B} = e^A e^B.$


14. As an alternative to the proof of (3.1.4), given in (3.1.3), which depends on term by term differentiation of power series, verify that, for $A \in M(n,\mathbb{C})$,

(3.1.72)    $\frac{d}{dt}e^{tA} = \lim_{h \to 0} \frac{1}{h}\bigl(e^{(t+h)A} - e^{tA}\bigr) = e^{tA}\lim_{h \to 0} \frac{1}{h}\bigl(e^{hA} - I\bigr) = e^{tA}A = Ae^{tA},$

the second identity in (3.1.72) by (3.1.71), the third by the definition (3.1.2), and the fourth by commutativity.


3.2. Exponentials and trigonometric functions 


In Chapter 1 we have seen how to use complex exponentials to give a self-contained treatment of basic results on the trigonometric functions $\cos t$ and $\sin t$. Here we present a variant, using matrix exponentials. We begin by looking at

(3.2.1)    $x(t) = e^{tJ}x_0, \quad J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$

which solves

(3.2.2)    $x'(t) = Jx(t), \quad x(0) = x_0 \in \mathbb{R}^2.$

We first note that the planar curve $x(t)$ moves about on a circle centered about the origin. Indeed,

(3.2.3)    $\frac{d}{dt}\|x(t)\|^2 = \frac{d}{dt}\bigl(x(t) \cdot x(t)\bigr) = x'(t) \cdot x(t) + x(t) \cdot x'(t) = Jx(t) \cdot x(t) + x(t) \cdot Jx(t) = 0,$

since $J^t = -J$. Thus $\|x(t)\| = \|x_0\|$ is constant. Furthermore the velocity $v(t) = x'(t)$ has constant magnitude; in fact

(3.2.4)    $\|v(t)\|^2 = v(t) \cdot v(t) = Jx(t) \cdot Jx(t) = \|x(t)\|^2,$

since $J^tJ = -J^2 = I$.
For example,

(3.2.5)    $\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} = e^{tJ}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$

is a curve, moving on the unit circle $x_1^2 + x_2^2 = 1$, at unit speed, with initial position $x(0) = (1,0)^t$ and initial velocity $v(0) = (0,1)^t$. Now in trigonometry the functions $\cos t$ and $\sin t$ are defined to be the $x_1$ and $x_2$ coordinates of such a parametrization of the unit circle, so we have

(3.2.6)    $\begin{pmatrix} \cos t \\ \sin t \end{pmatrix} = e^{tJ}\begin{pmatrix} 1 \\ 0 \end{pmatrix}.$

The differential equation (3.2.2) then gives

(3.2.7)    $\frac{d}{dt}\cos t = -\sin t, \quad \frac{d}{dt}\sin t = \cos t.$

Using

$e^{tJ}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = e^{tJ}J\begin{pmatrix} 1 \\ 0 \end{pmatrix} = Je^{tJ}\begin{pmatrix} 1 \\ 0 \end{pmatrix},$

we have a formula for $e^{tJ}(0,1)^t$, which together with (3.2.6) yields

(3.2.8)    $e^{tJ} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} = (\cos t)I + (\sin t)J.$

Then the identity $e^{(s+t)J} = e^{sJ}e^{tJ}$ yields the following identities, when matrix multiplication is carried out:

(3.2.9)    $\cos(s + t) = (\cos s)(\cos t) - (\sin s)(\sin t), \quad \sin(s + t) = (\cos s)(\sin t) + (\sin s)(\cos t).$
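
For readers who want to see the matrix multiplication behind the addition formulas written out, here is the product of the two rotation matrices given by (3.2.8); it is simply a worked version of the step described above, with no new assumptions.

```latex
e^{sJ}e^{tJ}
= \begin{pmatrix} \cos s & -\sin s \\ \sin s & \cos s \end{pmatrix}
  \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}
= \begin{pmatrix}
    \cos s\cos t - \sin s\sin t & -(\cos s\sin t + \sin s\cos t) \\
    \sin s\cos t + \cos s\sin t & \cos s\cos t - \sin s\sin t
  \end{pmatrix}.
```

Comparing the first column with that of $e^{(s+t)J}$, namely $(\cos(s+t), \sin(s+t))^t$, gives the two addition formulas.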


We now show how the treatment of $\sin t$ and $\cos t$ presented above is really quite close to that given in Chapter 1, §1.1. To start, we note that if $\mathbb{C}$ is regarded as a real vector space, with basis $e_1 = 1$, $e_2 = i$, and hence identified with $\mathbb{R}^2$, via

(3.2.10)    $z = x + iy \leftrightarrow \begin{pmatrix} x \\ y \end{pmatrix},$

then the matrix representation for the linear transformation $z \mapsto iz$ is given by $J$:

(3.2.11)    $iz = -y + ix, \quad J\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -y \\ x \end{pmatrix}.$

More generally, the linear transformation $z \mapsto (c + is)z$ has matrix representation

(3.2.12)    $\begin{pmatrix} c & -s \\ s & c \end{pmatrix}.$

Taking this into account, we see that the identity (3.2.8) is equivalent to

(3.2.13)    $e^{it} = \cos t + i \sin t,$

which is Euler's formula, as in (1.1.39) of Chapter 1.


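Here is another approach to the evaluation of $e^{tJ}$. We compute the eigenvalues and eigenvectors of $J$:

(3.2.14)    $\lambda_1 = i, \quad \lambda_2 = -i; \quad v_1 = \begin{pmatrix} 1 \\ -i \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ i \end{pmatrix}.$

Then, using the fact that $e^{tJ}v_k = e^{t\lambda_k}v_k$, we have

(3.2.15)    $e^{tJ}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2}e^{it}\begin{pmatrix} 1 \\ -i \end{pmatrix} + \frac{1}{2}e^{-it}\begin{pmatrix} 1 \\ i \end{pmatrix}.$

Comparison with (3.2.6) gives

(3.2.16)    $\cos t = \frac{1}{2}\bigl(e^{it} + e^{-it}\bigr), \quad \sin t = \frac{1}{2i}\bigl(e^{it} - e^{-it}\bigr),$

again leading to (3.2.13).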


Exercises 


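1. Recall Skew(n) and SO(n), defined by (2.11.7) and (2.12.4) of Chapter 2. Show that

(3.2.17)    $A \in \operatorname{Skew}(n) \implies e^{tA} \in SO(n), \quad \forall\, t \in \mathbb{R}.$

Note how this generalizes (3.2.3).

2. Given an $n \times n$ matrix $A$, let us set

(3.2.18)    $\cos tA = \frac{1}{2}\bigl(e^{itA} + e^{-itA}\bigr), \quad \sin tA = \frac{1}{2i}\bigl(e^{itA} - e^{-itA}\bigr).$

Show that

(3.2.19)    $\frac{d}{dt}\cos tA = -A \sin tA, \quad \frac{d}{dt}\sin tA = A \cos tA.$

3. In the context of Exercise 2, show that

(3.2.20)    $\cos tA = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}(tA)^{2k}, \quad \sin tA = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}(tA)^{2k+1}.$

Deduce that

$A \in M(n,\mathbb{R}) \implies \cos tA, \ \sin tA \in M(n,\mathbb{R}), \quad \forall\, t \in \mathbb{R}.$

4. Show that

$Av = \lambda v \implies (\cos tA)v = (\cos t\lambda)v, \quad (\sin tA)v = (\sin t\lambda)v.$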
5. Compute costA and sintA in each of the following cases: 
0 1 0 -1 1 1 0 4 
a= g) 49 Go) 4a (0 a) 49°C 9g): 
6. Suppose $A \in M(n,\mathbb{C})$ and

$B = \begin{pmatrix} 0 & -A \\ A & 0 \end{pmatrix} \in M(2n,\mathbb{C}).$

Show that

(3.2.21)    $e^{tB} = \begin{pmatrix} \cos tA & -\sin tA \\ \sin tA & \cos tA \end{pmatrix}.$

Hint. Denoting the right side of (3.2.21) by $X(t)$, show that

$X'(t) = BX(t).$


3.3. First order systems derived from higher order equations 


There is a standard process to convert an $n$th order differential equation

(3.3.1)    $\frac{d^n y}{dt^n} + a_{n-1}\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_1\frac{dy}{dt} + a_0 y = 0$

to a first order system. Set

(3.3.2)    $x_0(t) = y(t), \ x_1(t) = y'(t), \ \ldots, \ x_{n-1}(t) = y^{(n-1)}(t).$

Then $x = (x_0, \ldots, x_{n-1})^t$ satisfies

(3.3.3)    $x_0' = x_1, \quad \ldots, \quad x_{n-2}' = x_{n-1}, \quad x_{n-1}' = -a_{n-1}x_{n-1} - \cdots - a_0 x_0,$

or equivalently

(3.3.4)    $\frac{dx}{dt} = Ax,$

with

(3.3.5)    $A = \begin{pmatrix} 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-2} & -a_{n-1} \end{pmatrix}.$

The matrix $A$ given by (3.3.5) is called the companion matrix of the polynomial

(3.3.6)    $p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0.$

Note that a direct search for solutions to (3.3.1) of the form $e^{\lambda t}$ leads one to solve $p(\lambda) = 0$. Thus the following result is naturally suggested.

Proposition 3.3.1. If $p(\lambda)$ is a polynomial of the form (3.3.6), with companion matrix $A$, given by (3.3.5), then

(3.3.7)    $p(\lambda) = \det(\lambda I - A).$
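Proof. We look at

(3.3.8)    $\lambda I - A = \begin{pmatrix} \lambda & -1 & \cdots & 0 & 0 \\ 0 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda & -1 \\ a_0 & a_1 & \cdots & a_{n-2} & \lambda + a_{n-1} \end{pmatrix},$

and compute its determinant by expanding by minors down the first column. We see that

(3.3.9)    $\det(\lambda I - A) = \lambda \det(\lambda I - \tilde{A}) + (-1)^{n-1}a_0 \det B,$

where

(3.3.10)    $\tilde{A}$ is the companion matrix of $\lambda^{n-1} + a_{n-1}\lambda^{n-2} + \cdots + a_1$, and $B$ is lower triangular, with $-1$'s on the diagonal.

By induction on $n$, we have $\det(\lambda I - \tilde{A}) = \lambda^{n-1} + a_{n-1}\lambda^{n-2} + \cdots + a_1$, while $\det B = (-1)^{n-1}$. Substituting this into (3.3.9) gives (3.3.7).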
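
Proposition 3.3.1 can be tested on a concrete polynomial. The sketch below builds the companion matrix (3.3.5) and recomputes the coefficients of $\det(\lambda I - A)$ with the Faddeev-LeVerrier recursion, a standard method that is not used in the text; the function names and the sample polynomial $\lambda^3 + 2\lambda^2 - 5\lambda + 6$ are illustrative choices.

```python
from fractions import Fraction

def companion(a):
    """Companion matrix (3.3.5) of l^n + a[n-1] l^(n-1) + ... + a[1] l + a[0]."""
    n = len(a)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = Fraction(1)                # ones on the superdiagonal
    A[n - 1] = [-Fraction(c) for c in a]         # last row: -a_0, ..., -a_{n-1}
    return A

def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_{n-1}] of det(l I - A) = l^n + c_{n-1} l^{n-1} + ... + c_0,
    computed by the Faddeev-LeVerrier recursion."""
    n = len(A)
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(0)] * n + [Fraction(1)]   # leading coefficient is 1
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AM = mat_mul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs[:n]

a = [6, -5, 2]                                   # p(l) = l^3 + 2 l^2 - 5 l + 6
print(char_poly(companion(a)))
```

If the proposition holds, the coefficients returned by `char_poly` should reproduce the list `a` that was used to build the companion matrix.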


Converse construction 


We next show that each solution to a first order $n \times n$ system of the form (3.3.4) (for general $A \in M(n,\mathbb{F})$) also satisfies an $n$th order scalar ODE. Indeed, if (3.3.4) holds, then

(3.3.11)    $x^{(k)} = Ax^{(k-1)} = \cdots = A^k x.$

Now if $p(\lambda)$ is given by (3.3.7), and say

(3.3.12)    $p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0,$

then, by the Cayley-Hamilton theorem (cf. (2.8.17) of Chapter 2),

(3.3.13)    $p(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I = 0.$

Hence

(3.3.14)    $x^{(n)} = A^n x = -a_{n-1}A^{n-1}x - \cdots - a_1Ax - a_0x = -a_{n-1}x^{(n-1)} - \cdots - a_1x' - a_0x,$

so we have the asserted $n$th order scalar equation:

(3.3.15)    $x^{(n)} + a_{n-1}x^{(n-1)} + \cdots + a_1x' + a_0x = 0.$

REMARK. If the minimal polynomial $q(\lambda)$ of $A$ has degree $m$, less than $n$, we can replace $p$ by $q$ and derive analogues of (3.3.14)–(3.3.15), giving a single differential equation of order $m$ for $x$.

Exercises

1. Using the method (3.3.12)–(3.3.15), convert

$\frac{dx}{dt} = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}x$

into a second order scalar equation.

2. Using the method (3.3.2)–(3.3.3), convert

$y'' - 3y' + 2y = 0$

into a $2 \times 2$ first order system.

In Exercises 3–4, assume that $\lambda_1$ is a root of multiplicity $k \ge 2$ for the polynomial $p(\lambda)$ given by (3.3.6).

3. Verify that $e^{\lambda_1 t}, te^{\lambda_1 t}, \ldots, t^{k-1}e^{\lambda_1 t}$ are solutions to (3.3.1).


4. Deduce that, for each $j = 0, \ldots, k-1$, the system (3.3.3) has a solution of the form

(3.3.16)    $x(t) = (t^j + \alpha t^{j-1} + \cdots + \beta)e^{t\lambda_1}v,$

(with $v$ depending on $j$).

5. For given $A \in M(n,\mathbb{C})$, suppose $x' = Ax$ has a solution of the form (3.3.16). Show that $\lambda_1$ must be a root of multiplicity $\ge j + 1$ of the minimal polynomial of $A$.

Hint. Take into account the remark below (3.3.15).

6. Using Exercises 3–5, show that the minimal polynomial of the companion matrix $A$ in (3.3.5) must be the characteristic polynomial $p(\lambda)$.


3.4. Nonhomogeneous equations and Duhamel’s formula 


In §§3.1–3.3 we have focused on homogeneous equations, $x' - Ax = 0$. Here we consider the nonhomogeneous equation

(3.4.1)    $\frac{dx}{dt} - Ax = f(t), \quad x(0) = x_0 \in \mathbb{C}^n.$

Here $A \in M(n,\mathbb{C})$ and $f(t)$ takes values in $\mathbb{C}^n$. The key to solving this is to recognize that the left side of (3.4.1) is equal to

(3.4.2)    $e^{tA}\frac{d}{dt}\bigl(e^{-tA}x(t)\bigr),$

as follows from the product formula for the derivative and the defining property of $e^{tA}$, given in (3.1.4). Thus (3.4.1) is equivalent to

(3.4.3)    $\frac{d}{dt}\bigl(e^{-tA}x(t)\bigr) = e^{-tA}f(t), \quad x(0) = x_0,$

and integration yields

(3.4.4)    $e^{-tA}x(t) = x_0 + \int_0^t e^{-sA}f(s)\,ds.$

Applying $e^{tA}$ to both sides then gives the solution:

(3.4.5)    $x(t) = e^{tA}x_0 + \int_0^t e^{(t-s)A}f(s)\,ds.$

This is called Duhamel's formula.


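EXAMPLE. We combine methods of this section and §3.3 (and also §3.2) to solve

(3.4.6)    $y'' + y = f(t), \quad y(0) = y_0, \quad y'(0) = y_1.$

As in §3.3, set $x = (x_0, x_1) = (y, y')$, to obtain the system

$\frac{d}{dt}\begin{pmatrix} x_0 \\ x_1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + \begin{pmatrix} 0 \\ f(t) \end{pmatrix}.$

Recognizing the $2 \times 2$ matrix above as $-J$, and recalling from §3.2 that

(3.4.7)    $e^{(s-t)J} = \begin{pmatrix} \cos(s-t) & -\sin(s-t) \\ \sin(s-t) & \cos(s-t) \end{pmatrix},$

we obtain

(3.4.8)    $\begin{pmatrix} x_0(t) \\ x_1(t) \end{pmatrix} = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}\begin{pmatrix} y_0 \\ y_1 \end{pmatrix} + \int_0^t \begin{pmatrix} \cos(s-t) & -\sin(s-t) \\ \sin(s-t) & \cos(s-t) \end{pmatrix}\begin{pmatrix} 0 \\ f(s) \end{pmatrix} ds,$

and hence

(3.4.9)    $y(t) = (\cos t)y_0 + (\sin t)y_1 + \int_0^t \sin(t-s)f(s)\,ds.$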
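
As a numerical illustration of (3.4.9), the sketch below evaluates the Duhamel integral with the composite Simpson rule and compares it with a case that can be solved by hand: for $f \equiv 1$ and $y_0 = y_1 = 0$, the solution of $y'' + y = 1$ is $y(t) = 1 - \cos t$. The function name `duhamel_solution`, the quadrature rule, and the step count are assumptions made for this example only.

```python
import math

def duhamel_solution(f, y0, y1, t, steps=2000):
    """Evaluate (3.4.9): cos(t) y0 + sin(t) y1 + integral_0^t sin(t - s) f(s) ds,
    with the integral approximated by the composite Simpson rule."""
    if steps % 2:
        steps += 1                               # Simpson needs an even step count
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        if i == 0 or i == steps:
            weight = 1
        elif i % 2:
            weight = 4
        else:
            weight = 2
        total += weight * math.sin(t - s) * f(s)
    integral = total * h / 3
    return math.cos(t) * y0 + math.sin(t) * y1 + integral

t = 2.0
approx = duhamel_solution(lambda s: 1.0, 0.0, 0.0, t)
exact = 1.0 - math.cos(t)                        # hand-computed solution for f = 1
print(abs(approx - exact))
```

With $f \equiv 0$ the same function returns the homogeneous solution $(\cos t)y_0 + (\sin t)y_1$, which is a quick sanity check on the non-integral terms.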


Use of Duhamel’s formula is a good replacement for the method of variation of 
parameters, discussed in §1.14 of Chapter 1. See §3.9 of this chapter for more on 
this. See also §3.B for connections with the Laplace transform. 


Next, we briefly discuss a variant of the method of undetermined coefficients, introduced for single second-order equations in §1.10 of Chapter 1. We consider the following special case, the first-order $n \times n$ system

(3.4.10)    $\frac{dx}{dt} - Ax = (\cos \sigma t)v,$

given $\sigma \in \mathbb{R}$, $v \in \mathbb{R}^n$, and $A \in M(n,\mathbb{R})$ (or we could use complex coefficients). We assume

(3.4.11)    $i\sigma, \ -i\sigma \notin \operatorname{Spec} A,$

and look for a solution to (3.4.10) of the form

(3.4.12)    $x_p(t) = (\cos \sigma t)a + (\sin \sigma t)b, \quad a, b \in \mathbb{R}^n.$
Substitution into (3.4.10) and comparison of the coefficients of $\cos \sigma t$ and $\sin \sigma t$ gives $\sigma b - Aa = v$ and $Ab = -\sigma a$, which leads to success with

(3.4.13)    $a = -A(A^2 + \sigma^2 I)^{-1}v, \quad b = \sigma(A^2 + \sigma^2 I)^{-1}v.$

If (3.4.11) does not hold, (3.4.13) fails, and (3.4.10) might not have a solution of the form (3.4.12). Of course, (3.4.5) will work; (3.4.10) will have a solution of the form

(3.4.14)    $x(t) = \int_0^t (\cos \sigma s)e^{(t-s)A}v\,ds.$

When (3.4.11) holds and (3.4.12) works, the general solution to (3.4.10) is

(3.4.15)    $x(t) = e^{tA}u_0 + (\cos \sigma t)a + (\sin \sigma t)b, \quad u_0 \in \mathbb{R}^n,$

$u_0$ related to $x(0)$ by

(3.4.16)    $x(0) = u_0 + a.$

If all the eigenvalues of $A$ have negative real part, $e^{tA}u_0$ will decay to 0 as $t \to +\infty$. Then $e^{tA}u_0$ is called the transient part of the solution. The other part, $(\cos \sigma t)a + (\sin \sigma t)b$, is called the steady state solution.
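
The steady state solution can be checked on a small case. In the sketch below the matrix $A$, the frequency $\sigma = 1$, and the vector $v$ are arbitrary sample data (chosen so that the eigenvalues of $A$ are $-1$ and $-2$, hence (3.4.11) holds); the coefficients are obtained by solving $(A^2 + \sigma^2 I)y = v$ and setting $a = -Ay$, $b = \sigma y$, which is what substitution of (3.4.12) into (3.4.10) requires. The helper names are hypothetical.

```python
import math

def mat_vec(M, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

def solve2(S, v):
    """Solve the 2 x 2 system S y = v by Cramer's rule."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [(v[0] * S[1][1] - S[0][1] * v[1]) / det,
            (S[0][0] * v[1] - v[0] * S[1][0]) / det]

A = [[0.0, 1.0], [-2.0, -3.0]]                   # sample matrix, eigenvalues -1 and -2
sigma = 1.0                                      # sample forcing frequency
v = [0.0, 1.0]                                   # sample forcing direction

A2 = [[sum(A[i][k] * A[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
S = [[A2[i][j] + (sigma ** 2 if i == j else 0.0) for j in range(2)]
     for i in range(2)]                          # S = A^2 + sigma^2 I
y = solve2(S, v)
a = [-c for c in mat_vec(A, y)]                  # a = -A y
b = [sigma * c for c in y]                       # b = sigma y

def residual(t):
    """x_p'(t) - A x_p(t) - cos(sigma t) v for x_p = cos(sigma t) a + sin(sigma t) b."""
    c, s = math.cos(sigma * t), math.sin(sigma * t)
    xp = [c * a[i] + s * b[i] for i in range(2)]
    dxp = [-sigma * s * a[i] + sigma * c * b[i] for i in range(2)]
    Axp = mat_vec(A, xp)
    return [dxp[i] - Axp[i] - c * v[i] for i in range(2)]

print(residual(0.3))
```

The residual should vanish up to rounding at every time $t$, since the two coefficient conditions $\sigma b - Aa = v$ and $Ab = -\sigma a$ hold exactly.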



Exercises

1. Given $A \in M(n,\mathbb{C})$, set

$E_k(t) = \sum_{j=0}^{k} \frac{t^j}{j!}A^j.$

Verify that $E_k'(t) = AE_{k-1}(t)$ and that

$\frac{d}{dt}\bigl(E_k(t)e^{-tA}\bigr) = -\frac{t^k}{k!}A^{k+1}e^{-tA}.$
2. Verify that, if A is invertible, 


t 
0 


3. Solve the initial value problem 


da 0 1 et 0 
a ~ (1 a)**(e) #0 (9) 
4. Solve the initial value problem 


z-(t Ber(2): -0-Q) 
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5. Solve the initial value problem 


#-(, Ser(2): -0-Q) 


6. Produce analogues of (3.4.7)–(3.4.9) for

$y'' - 3y' + 2y = f(t), \quad y(0) = y_0, \quad y'(0) = y_1.$

In Exercises 7–8, take $X, Y \in M(n,\mathbb{C})$ and

(3.4.17)    $U(t,s) = e^{t(X+sY)}, \quad U_s(t,s) = \frac{\partial}{\partial s}U(t,s).$

7. Show that $U_s$ satisfies

$\frac{\partial U_s}{\partial t} = (X + sY)U_s + YU, \quad U_s(0,s) = 0.$

8. Use Duhamel's formula to show that

$U_s(t,s) = \int_0^t e^{(t-\tau)(X+sY)}\,Y\,e^{\tau(X+sY)}\,d\tau.$

Deduce that

(3.4.18)    $\frac{d}{ds}e^{X+sY}\Big|_{s=0} = e^{X}\int_0^1 e^{-\tau X}\,Y\,e^{\tau X}\,d\tau.$

9. Assume $X(t)$ is a smooth function of $t \in I$ with values in $M(n,\mathbb{C})$. Show that, for $t \in I$,

(3.4.19)    $\frac{d}{dt}e^{X(t)} = e^{X(t)}\int_0^1 e^{-\tau X(t)}\,X'(t)\,e^{\tau X(t)}\,d\tau.$

10. In the context of Exercise 9, assume

$t, t' \in I \implies X(t)X(t') = X(t')X(t).$

In such a case, simplify (3.4.19), and compare the result with that of Exercise 12 in §3.1.


3.5. Simple electrical circuits 


Here we extend the scope of the treatment of electrical circuits in §1.13. Rules 
worked out by Kirchhoff and others in the 1800s allow one to write down a system 
of linear differential equations describing the voltages and currents running along 
a variety of electrical circuits, containing resistors, capacitors, and inductors. 


There are two types of basic laws. The first type consists of two rules known 
as Kirchhoff’s laws: 
(A) The sum of the voltage drops around any closed loop is zero.
(B) The sum of the currents at any node is zero.

The second type of law specifies the voltage drop across each circuit element:

(a) Resistor: $V = IR$,
(b) Inductor: $V = L\,\frac{dI}{dt}$,
(c) Capacitor: $V = \frac{Q}{C}$.

In each case, $V$ is the voltage drop (in volts), $I$ is the current (in amps), $R$ is the resistance (in ohms), $L$ is the inductance (in henrys), $C$ is the capacitance (in farads), and $Q$ is the charge (in coulombs). We refer to §1.13 of Chapter 1 for basic information about these units. The rule (c) is supplemented by the following formula for the current across a capacitor:

(c2) $I = \frac{dQ}{dt}$.

In (b) and (c2), time is measured in seconds.

Rules (A), (B), and (a) give algebraic relations among the various voltages and currents, while rules (b) and (c)–(c2) give differential equations, namely

(3.5.1)    $L\,\frac{dI}{dt} = V$ (Inductor),
(3.5.2)    $C\,\frac{dV}{dt} = I$ (Capacitor).

Note that (3.5.2) results from applying $d/dt$ to (c) and then using (c2). If a circuit has $k$ capacitors and $\ell$ inductors, we get an $m \times m$ system of first order differential equations, with $m = k + \ell$.

We illustrate the formulation of such differential equations for circuits presented in Figure 3.5.1 and Figure 3.5.2. In each case, the circuit elements are numbered. We denote by $V_j$ the voltage drop across element $j$ and by $I_j$ the current across element $j$.


Figure 3.5.1 depicts a classical RLC circuit, such as treated in §1.13. Rules (A), (B), and (a) give

(3.5.3)    $V_1 + V_2 + V_3 = E(t), \quad I_1 = I_2 = I_3, \quad V_1 = RI_1.$

Equations (3.5.1)–(3.5.3) yield a system of two ODEs, for $I_3$ and $V_2$:

(3.5.4)    $L\,\frac{dI_3}{dt} = V_3, \quad C\,\frac{dV_2}{dt} = I_2.$
Figure 3.5.1. RLC circuit

We need to express $V_3$ and $I_2$ in terms of $I_3$, $V_2$, and $E(t)$, using (3.5.3). In fact, we have

(3.5.5)    $V_3 = E(t) - V_1 - V_2 = E(t) - RI_1 - V_2 = E(t) - RI_3 - V_2, \quad I_2 = I_3,$

so we get the system

(3.5.6)    $L\,\frac{dI_3}{dt} = -RI_3 - V_2 + E(t), \quad C\,\frac{dV_2}{dt} = I_3,$

or, in matrix form,

(3.5.7)    $\frac{d}{dt}\begin{pmatrix} I_3 \\ V_2 \end{pmatrix} = \begin{pmatrix} -R/L & -1/L \\ 1/C & 0 \end{pmatrix}\begin{pmatrix} I_3 \\ V_2 \end{pmatrix} + \frac{1}{L}\begin{pmatrix} E(t) \\ 0 \end{pmatrix}.$
Note that the characteristic polynomial of the matrix

(3.5.8)    $A = \begin{pmatrix} -R/L & -1/L \\ 1/C & 0 \end{pmatrix}$

is

(3.5.9)    $\lambda^2 + \frac{R}{L}\lambda + \frac{1}{LC},$

with roots

(3.5.10)    $\lambda = -\frac{R}{2L} \pm \frac{1}{2L}\sqrt{R^2 - 4\frac{L}{C}}.$

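
The roots of the characteristic polynomial of the RLC matrix are easy to evaluate numerically. The sketch below uses `cmath.sqrt` so that the case $R^2 < 4L/C$ (complex roots) needs no special handling; the function names are illustrative, and the sample values $R = 5$, $L = 4$, $C = 1$ are the ones that appear in Exercise 2 at the end of this section.

```python
import cmath

def rlc_roots(R, L, C):
    """Roots -R/(2L) +- (1/(2L)) sqrt(R^2 - 4L/C) of l^2 + (R/L) l + 1/(LC)."""
    disc = cmath.sqrt(R * R - 4.0 * L / C)
    return (-R + disc) / (2.0 * L), (-R - disc) / (2.0 * L)

def char_poly(lam, R, L, C):
    """Value of l^2 + (R/L) l + 1/(LC) at l = lam."""
    return lam * lam + (R / L) * lam + 1.0 / (L * C)

roots_real = rlc_roots(5.0, 4.0, 1.0)            # R^2 > 4L/C: two real roots
roots_complex = rlc_roots(1.0, 1.0, 1.0)         # R^2 < 4L/C: complex conjugate pair
print(roots_real)
print(roots_complex)
```

In both cases the real parts are negative when $R, L, C > 0$, which is the statement that the unforced circuit decays.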


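Figure 3.5.2. Another circuit

Let us now look at the slightly more complicated circuit depicted in Figure 3.5.2. Again we get a $2 \times 2$ system of differential equations. Rules (A), (B), and (a) give

(3.5.11)    $V_1 + V_2 + V_4 = E(t), \quad V_2 = V_3, \quad I_1 = I_2 + I_3 = I_4, \quad V_3 = R_3I_3, \quad V_4 = R_4I_4.$

Equations (3.5.1)–(3.5.2) yield differential equations for $I_1$ and $V_2$:

(3.5.12)    $C\,\frac{dV_2}{dt} = I_2, \quad L\,\frac{dI_1}{dt} = V_1.$

We need to express $I_2$ and $V_1$ in terms of $V_2$, $I_1$, and $E(t)$, using (3.5.11). In fact, we have

(3.5.13)    $I_2 = I_1 - I_3 = I_1 - \frac{V_3}{R_3} = I_1 - \frac{V_2}{R_3}, \quad V_1 = E(t) - V_2 - V_4 = E(t) - V_2 - R_4I_4 = E(t) - V_2 - R_4I_1,$

so we get the system

(3.5.14)    $C\,\frac{dV_2}{dt} = -\frac{1}{R_3}V_2 + I_1, \quad L\,\frac{dI_1}{dt} = -V_2 - R_4I_1 + E(t),$

or, in matrix form,

(3.5.15)    $\frac{d}{dt}\begin{pmatrix} V_2 \\ I_1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{R_3C} & \frac{1}{C} \\ -\frac{1}{L} & -\frac{R_4}{L} \end{pmatrix}\begin{pmatrix} V_2 \\ I_1 \end{pmatrix} + \frac{1}{L}\begin{pmatrix} 0 \\ E(t) \end{pmatrix}.$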
Figure 3.5.3. Circuit with two capacitors

Exercises 


1. Work out the $3 \times 3$ system of differential equations describing the behavior of the circuit depicted in Figure 3.5.3. Assume

$E(t) = 5 \sin 12t$ volts.

2. Using methods developed in §3.4, solve the $2 \times 2$ system (3.5.7) when

$R = 5$ ohms, $L = 4$ henrys, $C = 1$ farad,

and

$E(t) = 5 \cos 2t$ volts,

with initial data

$I_3(0) = 0$ amps, $V_2(0) = 5$ volts.

3. Solve the $2 \times 2$ system (3.5.15) when

$R_3 = 1$ ohm, $R_4 = 4$ ohms, $L = 4$ henrys, $C = 2$ farads,

and

$E(t) = 2 \cos 2t$ volts.

4. Use the method of (3.4.10)–(3.4.13) to find the steady state solution to (3.5.7), when

$E(t) = A \cos \sigma t.$

Take $A$, $\sigma$, $R$ and $L$ fixed and allow $C$ to vary. Show that the amplitude of the steady state solution is maximal (we say resonance is achieved) when

$LC = \frac{1}{\sigma^2},$

recovering calculations of (1.13.7)–(1.13.13) in Chapter 1.

Figure 3.6.1. Spring system

5. Work out the analogue of Exercise 4 with the system (3.5.7) replaced by (3.5.15). Is the condition for resonance the same as in Exercise 4?


6. Draw an electrical circuit that leads to a 4 x 4 system of differential equations, 
and write down said system. 


3.6. Second order systems 


Interacting physical systems often give rise to second order systems of differential equations. Consider for example a system of $n$ objects, of mass $m_1, \ldots, m_n$, connected to each other and to two walls by $n + 1$ springs, with spring constants $k_1, \ldots, k_{n+1}$, as in Figure 3.6.1. We assume the masses slide without friction. Denote by $x_j$ the position of the $j$th mass and by $y_j$ the degree to which the $j$th spring is stretched. The equations of motion are

(3.6.1)    $m_jx_j'' = -k_jy_j + k_{j+1}y_{j+1}, \quad 1 \le j \le n,$

and for certain constants $a_j$,

(3.6.2)    $y_j = x_j - x_{j-1} + a_j, \quad 2 \le j \le n, \qquad y_1 = x_1 + a_1, \quad y_{n+1} = -x_n + a_{n+1}.$

Substituting (3.6.2) into (3.6.1) yields an $n \times n$ system, which we can write in matrix form as

(3.6.3)    $Mx'' = -Kx + b,$

where $x = (x_1, \ldots, x_n)^t$, $b = (-k_1a_1 + k_2a_2, \ldots, -k_na_n + k_{n+1}a_{n+1})^t$,

(3.6.4)    $M = \begin{pmatrix} m_1 & & \\ & \ddots & \\ & & m_n \end{pmatrix},$

and

(3.6.5)    $K = \begin{pmatrix} k_1 + k_2 & -k_2 & & \\ -k_2 & k_2 + k_3 & \ddots & \\ & \ddots & k_{n-1} + k_n & -k_n \\ & & -k_n & k_n + k_{n+1} \end{pmatrix}.$

We assume $m_j > 0$ and $k_j > 0$ for each $j$. Then clearly $M$ is a positive definite matrix and $K$ is a real symmetric matrix.


Proposition 3.6.1. If each $k_j > 0$, then $K$, given by (3.6.5), is positive definite.

Proof. We have

(3.6.6)    $x \cdot Kx = \sum_{j=1}^{n} (k_j + k_{j+1})x_j^2 - 2\sum_{j=2}^{n} k_jx_{j-1}x_j.$

Now

(3.6.7)    $2x_{j-1}x_j \le x_{j-1}^2 + x_j^2,$

so

(3.6.8)    $x \cdot Kx \ge \sum_{j=1}^{n} k_jx_j^2 + \sum_{j=2}^{n+1} k_jx_{j-1}^2 - \sum_{j=2}^{n} k_jx_j^2 - \sum_{j=2}^{n} k_jx_{j-1}^2 = k_1x_1^2 + k_{n+1}x_n^2.$

Furthermore note that the inequality in (3.6.7) is strict unless $x_{j-1} = x_j$, so the inequality in (3.6.8) is strict unless $x_{j-1} = x_j$ for each $j \in \{2, \ldots, n\}$, i.e., unless $x_1 = \cdots = x_n$. This proves that $x \cdot Kx > 0$ whenever $x \in \mathbb{R}^n$, $x \ne 0$.

To be more precise, we can sharpen (3.6.7) to

(3.6.9)    $2x_{j-1}x_j = x_{j-1}^2 + x_j^2 - (x_j - x_{j-1})^2,$

and then (3.6.8) is sharpened to

(3.6.10)    $x \cdot Kx = k_1x_1^2 + k_{n+1}x_n^2 + \sum_{j=2}^{n} k_j(x_j - x_{j-1})^2.$

If we set

(3.6.11)    $\kappa = \min\{k_j : 1 \le j \le n + 1\},$

then (3.6.10) implies

(3.6.12)    $x \cdot Kx \ge \kappa\Bigl(x_1^2 + x_n^2 + \sum_{j=2}^{n} (x_j - x_{j-1})^2\Bigr).$
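
The identity (3.6.10) lends itself to a direct numerical check. The sketch below assembles $K$ from a list of spring constants and compares the quadratic form $x \cdot Kx$ with the sum of squares on the right side of (3.6.10); the function names and the sample data ($k = (1, 2, 3, 4)$, $x = (1, -2, 0.5)$) are illustrative choices.

```python
def stiffness_matrix(k):
    """K from (3.6.5) for spring constants k = [k_1, ..., k_{n+1}] (n masses)."""
    n = len(k) - 1
    K = [[0.0] * n for _ in range(n)]
    for j in range(n):
        K[j][j] = k[j] + k[j + 1]                # diagonal entry k_{j+1} + k_{j+2}
        if j + 1 < n:
            K[j][j + 1] = -k[j + 1]              # coupling through the shared spring
            K[j + 1][j] = -k[j + 1]
    return K

def quadratic_form(K, x):
    """x . K x"""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))

def sum_of_squares(k, x):
    """Right side of (3.6.10): k_1 x_1^2 + k_{n+1} x_n^2 + sum_{j=2}^n k_j (x_j - x_{j-1})^2."""
    n = len(x)
    total = k[0] * x[0] ** 2 + k[n] * x[n - 1] ** 2
    for j in range(1, n):
        total += k[j] * (x[j] - x[j - 1]) ** 2
    return total

k = [1.0, 2.0, 3.0, 4.0]                         # sample spring constants, n = 3
x = [1.0, -2.0, 0.5]                             # sample displacement vector
print(quadratic_form(stiffness_matrix(k), x), sum_of_squares(k, x))
```

Since every term in `sum_of_squares` is nonnegative when the $k_j$ are positive, and the terms vanish together only for $x = 0$, the two printed numbers agreeing is a concrete instance of Proposition 3.6.1.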


The system (3.6.3) is inhomogeneous, but it is readily converted into the homogeneous system

(3.6.13)    $Mz'' = -Kz, \quad z = x - K^{-1}b.$

This in turn can be rewritten

(3.6.14)    $z'' = -M^{-1}Kz.$

Note that

(3.6.15)    $L = M^{-1/2}KM^{-1/2} \implies M^{-1}K = M^{-1/2}LM^{1/2},$

where

(3.6.16)    $M^{1/2} = \begin{pmatrix} m_1^{1/2} & & \\ & \ddots & \\ & & m_n^{1/2} \end{pmatrix}.$

Proposition 3.6.2. The matrix $L$ is positive definite.

Proof. $x \cdot Lx = (M^{-1/2}x) \cdot K(M^{-1/2}x) > 0$ whenever $x \ne 0$.

According to (3.6.15), $M^{-1}K$ and $L$ are similar, so we have:

Corollary 3.6.3. For $M$ and $K$ of the form (3.6.4)–(3.6.5), with $m_j, k_j > 0$, the matrix $M^{-1}K$ is diagonalizable, and all its eigenvalues are positive.


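It follows that $\mathbb{R}^n$ has a basis $\{v_1, \ldots, v_n\}$ satisfying

(3.6.17)    $M^{-1}Kv_j = \lambda_j^2v_j, \quad \lambda_j > 0.$

Then the initial value problem

(3.6.18)    $Mz'' = -Kz, \quad z(0) = z_0, \quad z'(0) = z_1$

has the solution

(3.6.19)    $z(t) = \sum_{j=1}^{n} \Bigl(\alpha_j \cos \lambda_jt + \frac{\beta_j}{\lambda_j}\sin \lambda_jt\Bigr)v_j,$

where the coefficients $\alpha_j$ and $\beta_j$ are given by

(3.6.20)    $z_0 = \sum \alpha_jv_j, \quad z_1 = \sum \beta_jv_j.$

An alternative approach to the system (3.6.14) is to set

(3.6.21)    $u = M^{1/2}z,$

for which (3.6.14) becomes

(3.6.22)    $u'' = -Lu,$

with $L$ given by (3.6.15). Then $\mathbb{R}^n$ has an orthonormal basis $\{w_j : 1 \le j \le n\}$, satisfying

(3.6.23)    $Lw_j = \lambda_j^2w_j,$ namely $w_j = M^{1/2}v_j,$

with $v_j$ as in (3.6.17). Note that we can set

(3.6.24)    $L = A^2, \quad Aw_j = \lambda_jw_j,$

and (3.6.22) becomes

(3.6.25)    $u'' + A^2u = 0.$

One way to convert (3.6.25) to a first order $(2n) \times (2n)$ system is to set

(3.6.26)    $v = Au, \quad w = u'.$

Then (3.6.25) becomes

(3.6.27)    $\frac{d}{dt}\begin{pmatrix} v \\ w \end{pmatrix} = X\begin{pmatrix} v \\ w \end{pmatrix}, \quad X = \begin{pmatrix} 0 & A \\ -A & 0 \end{pmatrix}.$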
It is useful to note that when $X$ is given by (3.6.27), then

(3.6.28)    $e^{tX} = \begin{pmatrix} \cos tA & \sin tA \\ -\sin tA & \cos tA \end{pmatrix},$

where $\cos tA$ and $\sin tA$ are given in Exercises 2–3 in §3.2. One way to see this is to let $\Phi(t)$ denote the right side of (3.6.28) and use (3.2.19) to see that

(3.6.29)    $\frac{d}{dt}\Phi(t) = X\Phi(t), \quad \Phi(0) = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}.$

Then Exercise 4 of §3.1 implies $e^{tX} = \Phi(t)$. These calculations imply that the solution to (3.6.25), with initial data $u(0) = u_0$, $u'(0) = u_1$, is given by

(3.6.30)    $u(t) = (\cos tA)u_0 + A^{-1}(\sin tA)u_1.$

Compare (3.6.18)–(3.6.20). This works for each invertible $A \in M(n,\mathbb{C})$.


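We move to the inhomogeneous variant of (3.6.14), which as above we can transform to the following inhomogeneous variant of (3.6.25):

(3.6.31)    $u'' + A^2u = f(t), \quad u(0) = u_0, \quad u'(0) = u_1.$

Using the substitution (3.6.26), we get

(3.6.32)    $\frac{d}{dt}\begin{pmatrix} v \\ w \end{pmatrix} = X\begin{pmatrix} v \\ w \end{pmatrix} + \begin{pmatrix} 0 \\ f(t) \end{pmatrix}, \quad \begin{pmatrix} v(0) \\ w(0) \end{pmatrix} = \begin{pmatrix} Au_0 \\ u_1 \end{pmatrix}.$

Duhamel's formula applies to give

(3.6.33)    $\begin{pmatrix} v(t) \\ w(t) \end{pmatrix} = e^{tX}\begin{pmatrix} Au_0 \\ u_1 \end{pmatrix} + \int_0^t e^{(t-s)X}\begin{pmatrix} 0 \\ f(s) \end{pmatrix} ds.$

Using the formula (3.6.28) for $e^{tX}$, we see that the resulting formula for $v(t)$ in (3.6.33) is equivalent to

(3.6.34)    $u(t) = (\cos tA)u_0 + A^{-1}(\sin tA)u_1 + \int_0^t A^{-1}\sin\bigl((t-s)A\bigr)f(s)\,ds.$

This is the analogue of Duhamel's formula for the solution to (3.6.31).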


Coupled springs with friction 


We now return to the coupled spring problem and modify (3.6.1)–(3.6.2) to allow for friction. Thus we replace (3.6.1) by

(3.6.35)    $m_jx_j'' = -k_jy_j + k_{j+1}y_{j+1} - d_jx_j',$

where $y_j$ are as in (3.6.2) and $d_j > 0$ are friction coefficients. Then (3.6.3) is replaced by

(3.6.36)    $Mx'' = -Kx - Dx' + b,$

with $b$ as in (3.6.3), $M$ and $K$ as in (3.6.4)–(3.6.5), and

(3.6.37)    $D = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{pmatrix}, \quad d_j > 0.$

As in (3.6.13), we can convert (3.6.36) to the homogeneous system

(3.6.38)    $Mz'' = -Kz - Dz', \quad z = x - K^{-1}b.$

If we set $u = M^{1/2}z$, as in (3.6.21), then, parallel to (3.6.22)–(3.6.24), we get

(3.6.39)    $u'' + Bu' + A^2u = 0,$

where $A^2$ is as in (3.6.24), with $L = M^{-1/2}KM^{-1/2}$, as in (3.6.22), and

(3.6.40)    $B = M^{-1/2}DM^{-1/2} = \begin{pmatrix} d_1/m_1 & & \\ & \ddots & \\ & & d_n/m_n \end{pmatrix}.$

The substitution (3.6.26) converts the $n \times n$ second order system (3.6.39) to the $(2n) \times (2n)$ first order system

(3.6.41)    $\frac{d}{dt}\begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} 0 & A \\ -A & -B \end{pmatrix}\begin{pmatrix} v \\ w \end{pmatrix}.$

We can write (3.6.41) as

(3.6.42)    $\frac{d}{dt}\begin{pmatrix} v \\ w \end{pmatrix} = (X + Y)\begin{pmatrix} v \\ w \end{pmatrix},$

with $X$ as in (3.6.27) and

(3.6.43)    $Y = \begin{pmatrix} 0 & 0 \\ 0 & -B \end{pmatrix}.$

Note that

(3.6.44)    $XY = \begin{pmatrix} 0 & -AB \\ 0 & 0 \end{pmatrix}, \quad YX = \begin{pmatrix} 0 & 0 \\ BA & 0 \end{pmatrix},$

so these matrices do not commute. Thus $e^{t(X+Y)}$ might be difficult to calculate even when $A$ and $B$ commute. Such commutativity would hold, for example, if $m_1 = \cdots = m_n$ and $d_1 = \cdots = d_n$, in which case $B$ is a scalar multiple of the identity matrix.


When the positive, self adjoint operators $A$ and $B$ do commute, we can make the following direct attack on the system (3.6.39). We know (cf. Exercise 3 in §2.11, Chapter 2) that $\mathbb{R}^n$ has an orthonormal basis $\{w_1, \ldots, w_n\}$ for which

(3.6.45)    $Aw_j = \lambda_jw_j, \quad Bw_j = 2\mu_jw_j, \quad \lambda_j, \mu_j > 0.$

Then we can write a solution to (3.6.39) as

(3.6.46)    $u(t) = \sum u_j(t)w_j,$

where the real-valued coefficients $u_j(t)$ satisfy the equations

(3.6.47)    $u_j'' + 2\mu_ju_j' + \lambda_j^2u_j = 0,$

with solutions that are linear combinations:

(3.6.48)
$e^{-\mu_jt}\Bigl(\alpha_j \cos\sqrt{\lambda_j^2 - \mu_j^2}\,t + \beta_j \sin\sqrt{\lambda_j^2 - \mu_j^2}\,t\Bigr), \quad \lambda_j > \mu_j,$
$e^{-\mu_jt}\Bigl(\alpha_je^{\sqrt{\mu_j^2 - \lambda_j^2}\,t} + \beta_je^{-\sqrt{\mu_j^2 - \lambda_j^2}\,t}\Bigr), \quad \lambda_j < \mu_j,$
$e^{-\mu_jt}(\alpha_j + \beta_jt), \quad \lambda_j = \mu_j.$

These three cases correspond to modes that are said to be underdamped, overdamped, and critically damped, respectively.
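
The three regimes in (3.6.48) can be evaluated by one small function and checked against the scalar equation (3.6.47). In the sketch below, `mode` and `ode_residual` are hypothetical helper names, the parameter values are arbitrary samples, and the derivatives are approximated by central differences, so the residual is only expected to be small rather than exactly zero.

```python
import math

def mode(lam, mu, alpha, beta, t):
    """Solution of u'' + 2 mu u' + lam^2 u = 0 in the form (3.6.48)."""
    decay = math.exp(-mu * t)
    if lam > mu:                                 # underdamped
        w = math.sqrt(lam * lam - mu * mu)
        return decay * (alpha * math.cos(w * t) + beta * math.sin(w * t))
    if lam < mu:                                 # overdamped
        r = math.sqrt(mu * mu - lam * lam)
        return decay * (alpha * math.exp(r * t) + beta * math.exp(-r * t))
    return decay * (alpha + beta * t)            # critically damped

def ode_residual(lam, mu, alpha, beta, t, h=1e-4):
    """Central-difference estimate of u'' + 2 mu u' + lam^2 u at time t."""
    u0 = mode(lam, mu, alpha, beta, t)
    up = mode(lam, mu, alpha, beta, t + h)
    um = mode(lam, mu, alpha, beta, t - h)
    d1 = (up - um) / (2 * h)
    d2 = (up - 2 * u0 + um) / (h * h)
    return d2 + 2 * mu * d1 + lam * lam * u0

for lam, mu in [(2.0, 1.0), (1.0, 2.0), (1.5, 1.5)]:
    print(lam, mu, ode_residual(lam, mu, 1.0, -0.5, 0.8))
```

In each regime the factor $e^{-\mu_j t}$ forces decay, since $\sqrt{\mu_j^2 - \lambda_j^2} < \mu_j$ in the overdamped case.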


In cases where $A$ and $B$ do not commute, analysis of (3.6.39) is less explicit, but we can establish the following decay result.

Proposition 3.6.4. If $A, B \in M(n,\mathbb{C})$ are positive definite, then all of the eigenvalues of $Z = \begin{pmatrix} 0 & A \\ -A & -B \end{pmatrix}$ have negative real part.

Proof. Let's say $(v,w)^t \ne 0$ and $Z(v,w)^t = \lambda(v,w)^t$. Then

(3.6.49)    $Aw = \lambda v, \quad Av + Bw = -\lambda w,$

and

(3.6.50)    $(Z(v,w)^t, (v,w)^t) = -(Bw, w) + [(Aw, v) - (Av, w)],$

while also

(3.6.51)    $(Z(v,w)^t, (v,w)^t) = \lambda(\|v\|^2 + \|w\|^2).$

The two terms on the right side of (3.6.50) are real and purely imaginary, respectively, so we obtain

(3.6.52)    $(\operatorname{Re}\lambda)(\|v\|^2 + \|w\|^2) = -(Bw, w).$

If $(v,w)^t \ne 0$, we deduce that either $\operatorname{Re}\lambda < 0$ or $w = 0$. If $w = 0$, then (3.6.49) gives $Av = 0$, hence $v = 0$. Hence $w \ne 0$, and $\operatorname{Re}\lambda < 0$, as asserted.

Exercises 


1. Find the eigenvalues and eigenvectors of 
0 1 0 
101 
0 1 0 


2. Use the results of Exercise 1 to find the eigenvalues and eigenvectors of K, given 
by (3.6.5), in case n = 3 and 


3. Find the general solution to

$u'' + Bu' + A^2u = 0,$

in case $A^2 = K$, with $K$ as in Exercise 2, and $B = I$.


4. Find the general solution to 


” 01), 1 0 = 
U +(; a) + (4 i) u=0, 
5. Generalizing the treatment of (3.6.25), consider

(3.6.53)    $u'' + Lu = 0, \quad L \in M(N,\mathbb{C}).$

Assume $\mathbb{C}^N$ has a basis of eigenvectors $v_j$, such that $Lv_j = \lambda_j^2v_j$, $\lambda_j \in \mathbb{C}$, $\lambda_j \ne 0$. Show that the general solution to (3.6.53) has the form

(3.6.54)    $u(t) = \sum_{j=1}^{N} \bigl(\alpha_je^{i\lambda_jt} + \beta_je^{-i\lambda_jt}\bigr)v_j, \quad \alpha_j, \beta_j \in \mathbb{C}.$

How is this modified if some $\lambda_j = 0$?


6. Find the general solution to 


and to 


3.7. Curves in $\mathbb{R}^3$ and the Frenet-Serret equations

Given a curve $c(t) = (x(t), y(t), z(t))$ in 3-space, we define its velocity and acceleration by

(3.7.1)    $v(t) = c'(t), \quad a(t) = v'(t) = c''(t).$

We also define its speed $s'(t)$ and arclength by

(3.7.2)    $s'(t) = \|v(t)\|, \quad s(t) = \int_{t_0}^{t} s'(\tau)\,d\tau,$

assuming we start at $t = t_0$. We define the unit tangent vector to the curve as

(3.7.3)    $T(t) = \frac{v(t)}{\|v(t)\|}.$

Henceforth we assume the curve is parametrized by arclength.

We define the curvature $\kappa(s)$ of the curve and the normal $N(s)$ by

(3.7.4)    $\kappa(s) = \Bigl\|\frac{dT}{ds}\Bigr\|, \quad \frac{dT}{ds} = \kappa(s)N(s).$

Note that

(3.7.5)    $T(s) \cdot T(s) = 1 \implies T'(s) \cdot T(s) = 0,$

so indeed $N(s)$ is orthogonal to $T(s)$. We then define the binormal $B(s)$ by

(3.7.6)    $B(s) = T(s) \times N(s).$
Figure 3.7.1. Frenet frame at a point on a 3D curve

For each $s$, the vectors $T(s)$, $N(s)$ and $B(s)$ are mutually orthogonal unit vectors, known as the Frenet frame for the curve $c(s)$. See Figure 3.7.1 for an illustration. Rules governing the cross product yield

(3.7.7)    $T(s) = N(s) \times B(s), \quad N(s) = B(s) \times T(s).$

(For material on the cross product, see the exercises at the end of §2.12 of Chapter 2.)

The torsion of a curve measures the change in the plane generated by $T(s)$ and $N(s)$, or equivalently it measures the rate of change of $B(s)$. Note that, parallel to (3.7.5),

$B(s) \cdot B(s) = 1 \implies B'(s) \cdot B(s) = 0.$

Also, differentiating (3.7.6) and using (3.7.4), we have

(3.7.8)    $B'(s) = T'(s) \times N(s) + T(s) \times N'(s) = T(s) \times N'(s) \implies B'(s) \cdot T(s) = 0.$

We deduce that $B'(s)$ is parallel to $N(s)$. We define the torsion by

(3.7.9)    $\frac{dB}{ds} = -\tau(s)N(s).$

We complement the formulas (3.7.4) and (3.7.9) for $dT/ds$ and $dB/ds$ with one for $dN/ds$. Since $N(s) = B(s) \times T(s)$, we have

(3.7.10)    $\frac{dN}{ds} = \frac{dB}{ds} \times T + B \times \frac{dT}{ds} = -\tau N \times T + \kappa B \times N,$

or

(3.7.11)    $\frac{dN}{ds} = -\kappa(s)T(s) + \tau(s)B(s).$

Together, (3.7.4), (3.7.9) and (3.7.11) are known as the Frenet-Serret formulas.


EXAMPLE. Pick $a, b > 0$ and consider the helix

(3.7.12)    $c(t) = (a \cos t, a \sin t, bt).$

Then $v(t) = (-a \sin t, a \cos t, b)$ and $\|v(t)\| = \sqrt{a^2 + b^2}$, so we can pick $s = t\sqrt{a^2 + b^2}$ to parametrize by arc length. We have

(3.7.13)    $T(s) = \frac{1}{\sqrt{a^2 + b^2}}(-a \sin t, a \cos t, b),$

hence

(3.7.14)    $\frac{dT}{ds} = \frac{1}{a^2 + b^2}(-a \cos t, -a \sin t, 0).$

By (3.7.4), this gives

(3.7.15)    $\kappa(s) = \frac{a}{a^2 + b^2}, \quad N(s) = (-\cos t, -\sin t, 0).$

Hence

(3.7.16)    $B(s) = T(s) \times N(s) = \frac{1}{\sqrt{a^2 + b^2}}(b \sin t, -b \cos t, a).$

Then

(3.7.17)    $\frac{dB}{ds} = \frac{1}{a^2 + b^2}(b \cos t, b \sin t, 0),$

so, by (3.7.9),

(3.7.18)    $\tau(s) = \frac{b}{a^2 + b^2}.$

In particular, for the helix (3.7.12), we see that the curvature and torsion are constant.
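
The curvature and torsion of the helix can also be estimated numerically from the definitions $\kappa = \|dT/ds\|$ and $dB/ds = -\tau N$. The sketch below differentiates the frame by central differences in $s = t\sqrt{a^2 + b^2}$; the function names, the step `h`, and the sample values $a = 2$, $b = 1$, $t = 0.4$ are assumptions made for the illustration.

```python
import math

def cross(u, w):
    """Cross product of two 3-vectors."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

def helix_frame(a, b, t):
    """Unit tangent, normal, and binormal of c(t) = (a cos t, a sin t, b t)."""
    r = math.sqrt(a * a + b * b)
    T = (-a * math.sin(t) / r, a * math.cos(t) / r, b / r)
    N = (-math.cos(t), -math.sin(t), 0.0)
    B = cross(T, N)
    return T, N, B

def curvature_torsion(a, b, t, h=1e-5):
    """Estimate kappa = |dT/ds| and tau = -(dB/ds) . N by central differences."""
    r = math.sqrt(a * a + b * b)
    T_plus, _, B_plus = helix_frame(a, b, t + h)
    T_minus, _, B_minus = helix_frame(a, b, t - h)
    ds = 2 * h * r                               # arclength between the two samples
    dT = [(p - m) / ds for p, m in zip(T_plus, T_minus)]
    dB = [(p - m) / ds for p, m in zip(B_plus, B_minus)]
    _, N, _ = helix_frame(a, b, t)
    return math.sqrt(dot(dT, dT)), -dot(dB, N)

a, b = 2.0, 1.0
kappa, tau = curvature_torsion(a, b, 0.4)
print(kappa, a / (a * a + b * b))
print(tau, b / (a * a + b * b))
```

Repeating the call at other values of $t$ gives the same two numbers up to discretization error, in line with the observation that the helix has constant curvature and torsion.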


Let us collect the Frenet-Serret equations

(3.7.19)    $\frac{dT}{ds} = \kappa N, \quad \frac{dN}{ds} = -\kappa T + \tau B, \quad \frac{dB}{ds} = -\tau N,$

for a smooth curve $c(s)$ in $\mathbb{R}^3$, parametrized by arclength, with unit tangent $T(s)$, normal $N(s)$, and binormal $B(s)$, given by

(3.7.20)    $N(s) = \frac{1}{\kappa(s)}T'(s), \quad B(s) = T(s) \times N(s).$

The basic existence and uniqueness theory, which will be presented in Chapter 4, applies to (3.7.19). If $\kappa(s)$ and $\tau(s)$ are given smooth functions on an interval $I = (a,b)$ and $s_0 \in I$, then, given $T_0, N_0, B_0 \in \mathbb{R}^3$, (3.7.19) has a unique solution on $s \in I$ satisfying

(3.7.21)    $T(s_0) = T_0, \quad N(s_0) = N_0, \quad B(s_0) = B_0.$

In fact, the case when $\kappa(s)$ and $\tau(s)$ are analytic will be subsumed in the material of §3.10 of this chapter. We now establish the following.


Proposition 3.7.1. Assume $\kappa$ and $\tau$ are given smooth functions on I, with $\kappa > 0$
on I. Assume $\{T_0, N_0, B_0\}$ is an orthonormal basis of $\mathbb{R}^3$, such that $B_0 = T_0 \times N_0$.
Then there exists a smooth, unit-speed curve c(s), $s \in I$, for which the solution to
(3.7.19) and (3.7.21) is the Frenet frame.


To construct the curve, take T(s), N(s), and B(s) to solve (3.7.19) and (3.7.21),
pick $p \in \mathbb{R}^3$ and set

(3.7.22)  $c(s) = p + \int_{s_0}^{s} T(\sigma)\, d\sigma,$

so T(s) = c'(s) is the velocity of this curve. To deduce that {T(s), N(s), B(s)} is
the Frenet frame for c(s), for all $s \in I$, we need to know:

(3.7.23)  $\{T(s), N(s), B(s)\}$ orthonormal, with $B(s) = T(s) \times N(s), \quad \forall\, s \in I.$
In order to pursue the analysis further, it is convenient to form the 3 x 3 

matrix-valued function 

(3.7.24) F(s) = (T(s), N(s), B(s)); 

whose columns consist respectively of T(s), N(s), and B(s). Then (3.7.23) is 

equivalent to 

(3.7.25)  $F(s) \in SO(3), \quad \forall\, s \in I,$


with SO(3) defined as in (2.12.4) of Chapter 2. The hypothesis on $\{T_0, N_0, B_0\}$
stated in Proposition 3.7.1 is equivalent to $F_0 = (T_0, N_0, B_0) \in SO(3)$. Now F(s)
satisfies the differential equation

(3.7.26)  $F'(s) = F(s)A(s), \quad F(s_0) = F_0,$
where
(3.7.27)  $A(s) = \begin{pmatrix} 0 & -\kappa(s) & 0 \\ \kappa(s) & 0 & -\tau(s) \\ 0 & \tau(s) & 0 \end{pmatrix}.$
Note that
(3.7.28)  $\frac{dF^*}{ds} = A(s)^* F(s)^* = -A(s) F(s)^*,$
since A(s) in (3.7.27) is skew-adjoint. Hence
(3.7.29)  $\frac{d}{ds}\, F(s)F(s)^* = \frac{dF}{ds} F(s)^* + F(s) \frac{dF^*}{ds} = F(s)A(s)F(s)^* - F(s)A(s)F(s)^* = 0.$

Thus, whenever (3.7.26)-(3.7.27) hold,
(3.7.30)  $F_0 F_0^* = I \Longrightarrow F(s)F(s)^* = I.$
Since det F(s) is continuous in s, takes values in $\{\pm 1\}$, and equals 1 at $s = s_0$,
we obtain (3.7.25), and hence we have (3.7.23).


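Let us specialize the system (3.7.19), or equivalently (3.7.26), to the case where
$\kappa$ and $\tau$ are constant, i.e.,

(3.7.31)  $F'(s) = F(s)A, \quad A = \begin{pmatrix} 0 & -\kappa & 0 \\ \kappa & 0 & -\tau \\ 0 & \tau & 0 \end{pmatrix},$

with solution
(3.7.32)  $F(s) = F_0\, e^{(s - s_0)A}.$

We have already seen that a helix of the form (3.7.12) has curvature $\kappa$ and torsion
$\tau$, with

(3.7.33)  $\kappa = \frac{a}{a^2+b^2}, \quad \tau = \frac{b}{a^2+b^2},$
and hence
(3.7.34)  $a = \frac{\kappa}{\kappa^2+\tau^2}, \quad b = \frac{\tau}{\kappa^2+\tau^2}.$

In (3.7.12), s and t are related by $t = s\sqrt{\kappa^2+\tau^2}$.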


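We can also see such a helix arise via a direct calculation of $e^{sA}$, which we now
produce. First, a straightforward calculation gives, for A as in (3.7.31),

(3.7.35)  $\det(\lambda I - A) = \lambda(\lambda^2 + \kappa^2 + \tau^2),$
hence
(3.7.36)  $\mathrm{Spec}(A) = \{0, \pm i\sqrt{\kappa^2+\tau^2}\}.$

An inspection shows that we can take

(3.7.37)  $v_1 = \frac{1}{\sqrt{\kappa^2+\tau^2}} \begin{pmatrix} \tau \\ 0 \\ \kappa \end{pmatrix}, \quad v_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad v_3 = \frac{1}{\sqrt{\kappa^2+\tau^2}} \begin{pmatrix} -\kappa \\ 0 \\ \tau \end{pmatrix},$
and then

(3.7.38)  $Av_1 = 0, \quad Av_2 = \sqrt{\kappa^2+\tau^2}\, v_3, \quad Av_3 = -\sqrt{\kappa^2+\tau^2}\, v_2.$
In particular, with respect to the basis $\{v_2, v_3\}$ of $V = \mathrm{Span}\{v_2, v_3\}$, $A|_V$ has the
matrix representation

(3.7.39)  $B_A = \sqrt{\kappa^2+\tau^2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$

We see that

(3.7.40)  $e^{sA} v_1 = v_1,$
while, in light of the calculations giving (3.2.8),

(3.7.41)  $e^{sA} v_2 = (\cos s\sqrt{\kappa^2+\tau^2})\, v_2 + (\sin s\sqrt{\kappa^2+\tau^2})\, v_3,$
          $e^{sA} v_3 = -(\sin s\sqrt{\kappa^2+\tau^2})\, v_2 + (\cos s\sqrt{\kappa^2+\tau^2})\, v_3.$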

Exercises


1. Consider a curve c(t) in $\mathbb{R}^3$, not necessarily parametrized by arclength. Show
that the acceleration a(t) is given by

(3.7.42)  $a(t) = \frac{d^2 s}{dt^2}\, T + \kappa \Bigl(\frac{ds}{dt}\Bigr)^2 N.$

Hint. Differentiate $v(t) = (ds/dt)T(t)$ and use the chain rule $dT/dt = (ds/dt)(dT/ds)$,
plus (3.7.4).


2. Show that 


(3.7.43)  $\kappa B = \frac{v \times a}{\|v\|^3}.$


Hint. Take the cross product of both sides of (3.7.42) with T, and use (3.7.6).


3. In the setting of Exercises 1-2, show that 
(3.7.44)  $\kappa^2 \tau \|v\|^6 = -a \cdot (v \times a').$
Deduce from (3.7.43)-(3.7.44) that

(3.7.45)  $\tau = \frac{(v \times a) \cdot a'}{\|v \times a\|^2}.$
Hint. Proceed from (3.7.43) to
$\frac{d}{dt}\bigl(\kappa \|v\|^3\bigr) B + \kappa \|v\|^3 \frac{dB}{dt} = \frac{d}{dt}(v \times a) = v \times a',$

and use $dB/dt = -\tau (ds/dt) N$, as a consequence of (3.7.9). Then dot with a, and
use $a \cdot N = \kappa \|v\|^2$, from (3.7.42), to get (3.7.44).


4. Consider the curve c(t) in $\mathbb{R}^3$ given by
$c(t) = (a\cos t,\ b\sin t,\ t),$

where a and b are given positive constants. Compute the curvature, torsion, and
Frenet frame.

Hint. Use (3.7.43) to compute $\kappa$ and B. Then use $N = B \times T$. Use (3.7.45) to
compute $\tau$.


5. Suppose $c$ and $\tilde{c}$ are two curves, both parametrized by arc length over $0 \le s \le L$,
and both having the same curvature $\kappa(s) > 0$ and the same torsion $\tau(s)$. Show
that there exist $x_0 \in \mathbb{R}^3$ and $A \in O(3)$ such that

$\tilde{c}(s) = A c(s) + x_0, \quad \forall\, s \in [0, L].$

Hint. To begin, show that if their Frenet frames coincide at s = 0, i.e., $\tilde{T}(0) = T(0)$,
$\tilde{N}(0) = N(0)$, $\tilde{B}(0) = B(0)$, then $\tilde{T} \equiv T$, $\tilde{N} \equiv N$, $\tilde{B} \equiv B$.


6. Suppose c is a curve in $\mathbb{R}^3$ with curvature $\kappa > 0$. Show that there exists a plane
in which c(t) lies for all t if and only if $\tau \equiv 0$.

Hint. When $\tau \equiv 0$, the plane should be parallel to the orthogonal complement of
B.


3.8. Variable coefficient systems 


Here we consider a variable coefficient $n \times n$ first order system
(3.8.1)  $\frac{dx}{dt} = A(t)x, \quad x(t_0) = x_0 \in \mathbb{C}^n,$

and its inhomogeneous analogue. The general theory, which will be presented in
Chapter 4, implies that if A(t) is a continuous function of $t \in I = (a,b)$ and $t_0 \in I$,
then (3.8.1) has a unique solution x(t) for $t \in I$, depending linearly on $x_0$, so

(3.8.2)  $x(t) = S(t, t_0)x_0, \quad S(t, t_0) \in \mathcal{L}(\mathbb{C}^n).$

See §3.10 of this chapter for power series methods of constructing $S(t, t_0)$, when
A(t) is analytic. As we have seen,

(3.8.3)  $A(t) \equiv A \Longrightarrow S(t, t_0) = e^{(t - t_0)A}.$


However, for variable coefficient equations there is not such a simple formula, and 
the matrix entries of S(t, s) can involve a multiplicity of new special functions, such 
as Bessel functions, Airy functions, Legendre functions, and many more. We will 
not dwell on this here, but we will note how S(t, to) is related to a “complete set” 
of solutions to (3.8.1). 


Suppose $x_1(t), \dots, x_n(t)$ are n solutions to (3.8.1) (but with different initial
conditions). Fix $t_0 \in I$, and assume

(3.8.4)  $x_1(t_0), \dots, x_n(t_0)$ are linearly independent in $\mathbb{C}^n$,

or equivalently these vectors form a basis of $\mathbb{C}^n$. Given such solutions $x_j(t)$, we
form the $n \times n$ matrix

(3.8.5)  $M(t) = (x_1(t), \dots, x_n(t)),$

whose jth column is $x_j(t)$. This matrix function solves

(3.8.6)  $\frac{dM}{dt} = A(t)M(t).$

The condition (3.8.4) is equivalent to the statement that $M(t_0)$ is invertible. We
claim that if M solves (3.8.6) and $M(t_0)$ is invertible then M(t) is invertible for all
$t \in I$. To see this, we use the fact that the invertibility of M(t) is equivalent to the
non-vanishing of the quantity

(3.8.7)  $W(t) = \det M(t),$
called the Wronskian of $\{x_1(t), \dots, x_n(t)\}$. It is also notable that W(t) solves a
differential equation. In general we have

(3.8.8)  $\frac{d}{dt} \det M(t) = (\det M(t))\, \mathrm{Tr}\bigl(M(t)^{-1} M'(t)\bigr).$
(See Exercises 1-3 below.) Let $\tilde{I} \subset I$ be the maximal interval containing $t_0$ on
which M(t) is invertible. Then (3.8.8) holds for $t \in \tilde{I}$. When (3.8.6) holds, we have

$\mathrm{Tr}\bigl(M(t)^{-1} M'(t)\bigr) = \mathrm{Tr}\bigl(M(t)^{-1} A(t) M(t)\bigr) = \mathrm{Tr}\, A(t),$

so the Wronskian solves the differential equation

(3.8.9)  $\frac{dW}{dt} = (\mathrm{Tr}\, A(t))\, W(t).$
Hence
(3.8.10)  $W(t) = e^{b(t,s)} W(s), \quad b(t,s) = \int_s^t \mathrm{Tr}\, A(\tau)\, d\tau.$

This implies $\tilde{I} = I$ and hence gives the asserted invertibility. From here we obtain
the following.


Proposition 3.8.1. If M(t) solves (3.8.6) for $t \in I$ and $M(t_0)$ is invertible, then
(3.8.11)  $S(t, t_0) = M(t) M(t_0)^{-1}, \quad \forall\, t \in I.$


Proof. We have seen that M(t) is invertible for all $t \in I$. If x(t) solves (3.8.1), set
(3.8.12)  $y(t) = M(t)^{-1} x(t),$

and apply d/dt to x(t) = M(t)y(t), obtaining

(3.8.13)  $\frac{dx}{dt} = M'(t)y(t) + M(t)y'(t) = A(t)M(t)y(t) + M(t)y'(t) = A(t)x + M(t)y'(t).$
If x(t) solves (3.8.1), this yields
(3.8.14)  $\frac{dy}{dt} = 0,$

hence $y(t) = y(t_0)$ for all $t \in I$, i.e.,
(3.8.15)  $M(t)^{-1} x(t) = M(t_0)^{-1} x(t_0).$
Applying M(t) to both sides gives (3.8.11).
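
The Wronskian identity (3.8.10) is easy to test numerically. The sketch below integrates the matrix equation (3.8.6) for a 2 x 2 example with a classical fourth order Runge-Kutta scheme and compares det M(1) with the exponential of the integral of Tr A. The coefficient matrix, the step count, and all function names are assumptions chosen only for illustration.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add_scaled(M, c, D):
    """Return M + c*D, entry by entry."""
    n = len(M)
    return [[M[i][j] + c*D[i][j] for j in range(n)] for i in range(n)]

def rk4_matrix(A, M0, t0, t1, steps):
    """Integrate M'(t) = A(t) M(t) from t0 to t1 with classical RK4."""
    h = (t1 - t0)/steps
    M, t = M0, t0
    for _ in range(steps):
        k1 = mat_mul(A(t), M)
        k2 = mat_mul(A(t + h/2), mat_add_scaled(M, h/2, k1))
        k3 = mat_mul(A(t + h/2), mat_add_scaled(M, h/2, k2))
        k4 = mat_mul(A(t + h), mat_add_scaled(M, h, k3))
        incr = mat_add_scaled(mat_add_scaled(k1, 2.0, k2),
                              1.0, mat_add_scaled(k4, 2.0, k3))
        M = mat_add_scaled(M, h/6, incr)
        t += h
    return M

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def A(t):
    # Hypothetical variable coefficient matrix with Tr A(t) = -t.
    return [[t, 1.0], [math.sin(t), -2.0*t]]

M1 = rk4_matrix(A, [[1.0, 0.0], [0.0, 1.0]], 0.0, 1.0, 2000)
wronskian_numeric = det2(M1)
wronskian_formula = math.exp(-0.5)   # exp of the integral of -t over [0, 1]
print(wronskian_numeric, wronskian_formula)
```

Note that the off-diagonal entries of A(t) affect M(t) but not the Wronskian, which depends only on the trace.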


Note also that, for s,t € I, 


(3.8.16)  $S(t, s) = M(t) M(s)^{-1}$
gives $S(t, s)x(s) = x(t)$ for each solution x(t) to (3.8.1). We also have
(3.8.17)  $S(t, t_0) = S(t, s) S(s, t_0), \quad S(t, s) = S(s, t)^{-1}.$


There is a more general version of the Duhamel formula (3.4.5) for the solution 
to an inhomogeneous differential equation 


(3.8.18)  $\frac{dx}{dt} = A(t)x + f(t), \quad x(t_0) = x_0.$


To solve (3.8.18), set x(t) = M(t)y(t), as in (3.8.12). This time, (3.8.13) yields the
identity $M(t)y'(t) = f(t)$, or

$\frac{dy}{dt} = M(t)^{-1} f(t), \quad y(t_0) = M(t_0)^{-1} x_0,$

and hence

(3.8.19)  $x(t) = M(t) M(t_0)^{-1} x_0 + M(t) \int_{t_0}^{t} M(s)^{-1} f(s)\, ds,$

for invertible M(t) as in (3.8.5)-(3.8.6). Equivalently,

(3.8.20)  $x(t) = S(t, t_0) x_0 + \int_{t_0}^{t} S(t, s) f(s)\, ds.$


We note that there is a simple formula for the solution operator S(t, s) to (3.8.1) 
in case the following commutativity hypothesis holds: 


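(3.8.21)  $A(t)A(t') = A(t')A(t), \quad \forall\, t, t' \in I.$
We claim that if
(3.8.22)  $B(t, s) = -\int_s^t A(\tau)\, d\tau,$
then
(3.8.23)  (3.8.21) $\Longrightarrow \frac{d}{dt}\bigl(e^{B(t,s)} x(t)\bigr) = e^{B(t,s)} \bigl(x'(t) - A(t)x(t)\bigr),$
from which it follows that
(3.8.24)  (3.8.21) $\Longrightarrow S(t, s) = e^{\int_s^t A(\tau)\, d\tau}.$

(This identity fails in the absence of the hypothesis (3.8.21).)
To establish (3.8.23), we note that (3.8.21) implies

(3.8.25)  $B(t, s) B(t', s) = B(t', s) B(t, s), \quad \forall\, s, t, t' \in I.$

Next,

(3.8.26)  (3.8.25) $\Longrightarrow \lim_{h \to 0} \frac{1}{h}\bigl(e^{B(t+h,s)} - e^{B(t,s)}\bigr)$
          $= \lim_{h \to 0} \frac{1}{h}\, e^{B(t,s)} \bigl(e^{B(t+h,s) - B(t,s)} - I\bigr)$
          $= -e^{B(t,s)} A(t)$
          $\Longrightarrow \frac{d}{dt} e^{B(t,s)} = -e^{B(t,s)} A(t),$
from which (3.8.23) follows.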


Here is an application of (3.8.24). Let x(s) be a planar curve, on an interval
about s = 0, parametrized by arc-length, with unit tangent T(s) = x'(s). Then the
Frenet-Serret equations (3.7.1) simplify to $T' = \kappa N$, with $N = JT$, i.e., to

(3.8.27)  $T'(s) = \kappa(s) J T(s),$
with J as in (3.2.1). Clearly the commutativity hypothesis (3.8.21) holds for
$A(s) = \kappa(s)J$, so we deduce that
(3.8.28)  $T(s) = e^{\lambda(s)J}\, T(0), \quad \lambda(s) = \int_0^s \kappa(\tau)\, d\tau.$
Recall that $e^{tJ}$ is given by (3.2.8), i.e.,

(3.8.29)  $e^{tJ} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.$
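
To see (3.8.27)-(3.8.29) at work, here is a small sketch that builds the unit tangent from the rotation formula and checks by a centered finite difference that it satisfies $T' = \kappa J T$. The curvature function $\kappa(s) = s$ (so $\lambda(s) = s^2/2$), the sample point, and the step size are assumptions made for illustration.

```python
import math

def unit_tangent(s, lam, T0=(1.0, 0.0)):
    """T(s) = e^{lam(s) J} T(0), with e^{tJ} the rotation matrix in (3.8.29)."""
    c, sn = math.cos(lam(s)), math.sin(lam(s))
    return (c*T0[0] - sn*T0[1], sn*T0[0] + c*T0[1])

def kappa(s):
    # Hypothetical curvature function.
    return s

def lam(s):
    # Integral of kappa from 0 to s.
    return 0.5*s*s

s, h = 1.3, 1e-5
T_plus = unit_tangent(s + h, lam)
T_minus = unit_tangent(s - h, lam)
T_here = unit_tangent(s, lam)

# Centered difference approximation to T'(s).
dT = ((T_plus[0] - T_minus[0])/(2*h), (T_plus[1] - T_minus[1])/(2*h))
# J T with J = [[0, -1], [1, 0]].
JT = (-T_here[1], T_here[0])
residual = max(abs(dT[0] - kappa(s)*JT[0]), abs(dT[1] - kappa(s)*JT[1]))
print(residual)
```

The residual is at the level of the finite difference error, consistent with (3.8.28).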

We return to the general (noncommutative) case, and record the following
estimate on the operator norm of S(t, s).


Proposition 3.8.2. Assume that for $t \in I = (a,b)$,

(3.8.30)  $\mathrm{Re}\,(A(t)v, v) \le K\|v\|^2, \quad \forall\, v \in \mathbb{C}^n.$
Then
(3.8.31)  $a < s \le t < b \Longrightarrow \|S(t, s)\| \le e^{K(t-s)}.$


Proof. If $u(t) = S(t, s)v(s)$, then $w(t) = e^{-K(t-s)} u(t)$ solves

(3.8.32)  $\frac{dw}{dt} = C(t)w(t), \quad C(t) = A(t) - K.$

It suffices to show that $\|w(t)\|$ is nonincreasing in t. Indeed,

(3.8.33)  $\frac{d}{dt} \|w(t)\|^2 = 2\,\mathrm{Re}\,(w'(t), w(t)) = 2\,\mathrm{Re}\,(C(t)w(t), w(t)) \le 0,$

and we have the result.


We now combine Proposition 3.8.2 and Duhamel's formula to estimate the
difference between solutions x(t) and y(t) to

(3.8.34)  $\frac{dx}{dt} = A(t)x, \quad x(t_0) = v,$
          $\frac{dy}{dt} = B(t)y, \quad y(t_0) = v.$

Let us assume that

(3.8.35)  $\mathrm{Re}\,(A(t)w, w) \le K\|w\|^2,$
          $\mathrm{Re}\,(B(t)w, w) \le K\|w\|^2,$

for all $w \in \mathbb{C}^n$. Now, given (3.8.34), our goal is to estimate z(t) = x(t) - y(t), for
$t \ge t_0$. We have

(3.8.36)  $\frac{dz}{dt} = A(t)x - B(t)y = A(t)z + (A(t) - B(t))y, \quad z(t_0) = 0,$

so, with S(t, s) denoting the solution operator to (3.8.1), we have
(3.8.37)  $z(t) = \int_{t_0}^{t} S(t, s)[A(s) - B(s)]y(s)\, ds.$
Proposition 3.8.2 gives

(3.8.38)  $\|y(s)\| \le e^{K(s - t_0)} \|v\|,$
          $\|S(t, s)\| \le e^{K(t-s)}, \quad \text{for } t_0 \le s \le t,$

so we have the following conclusion.

Proposition 3.8.3. Assume that x and y solve (3.8.34) and that (3.8.35) holds.
Then

(3.8.39)  $\|x(t) - y(t)\| \le e^{K(t - t_0)} \Bigl(\int_{t_0}^{t} \|A(s) - B(s)\|\, ds\Bigr) \|v\|,$

for $t \ge t_0$.


Exercises 


Exercises 1-3 lead to a proof of the formula (3.8.8) for the derivative of det M(t). 


1. Let $A \in M(n, \mathbb{C})$. Show that, as $s \to 0$,
$\det(I + sA) = (1 + s a_{11}) \cdots (1 + s a_{nn}) + O(s^2) = 1 + s\, \mathrm{Tr}\, A + O(s^2),$
hence

$\frac{d}{ds} \det(I + sA)\Big|_{s=0} = \mathrm{Tr}\, A.$


2. Let B(s) be a smooth matrix-valued function of s, with B(0) = I. Use Exercise
1 to show that

$\frac{d}{ds} \det B(s)\Big|_{s=0} = \mathrm{Tr}\, B'(0).$

Hint. Write $B(s) = I + sB'(0) + O(s^2)$.


3. Let C(s) be a smooth matrix-valued function, and assume C(0) is invertible.
Use Exercise 2 plus

$\det C(s) = (\det C(0)) \det B(s), \quad B(s) = C(0)^{-1} C(s),$
to show that
$\frac{d}{ds} \det C(s)\Big|_{s=0} = (\det C(0))\, \mathrm{Tr}\, C(0)^{-1} C'(0).$
Use this to prove (3.8.8).
Hint. Fix t and set C(s) = M(t + s), so

$\frac{d}{dt} \det M(t) = \frac{d}{ds} \det C(s)\Big|_{s=0}.$


4. Show that, if M(t) is smooth and invertible on an interval I, then
$\frac{d}{dt} M(t)^{-1} = -M(t)^{-1} M'(t) M(t)^{-1}.$

Then apply d/dt to (3.8.12), to obtain an alternative derivation of Proposition 3.8.1.

Hint. Set $U(t) = M(t)^{-1}$ and differentiate the identity U(t)M(t) = I.


5. Set up and solve the Wronskian equation (3.8.9) in the following cases:


w=) Ga) (a): 


Exercises 6-7 generalize (3.8.27)-(3.8.29) from the case of zero torsion (cf. Exercise
6 of §3.7) to the case

(3.8.40)  $\tau(t) = \beta \kappa(t), \quad \beta$ constant.


6. Assume x(t) is a curve in $\mathbb{R}^3$ for which (3.8.40) holds. Show that
$x(t) = x(0) + \int_0^t T(s)\, ds$, with

(3.8.41)  $(T(t), N(t), B(t)) = (T(0), N(0), B(0))\, e^{\sigma(t)K},$
where
(3.8.42)  $K = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & -\beta \\ 0 & \beta & 0 \end{pmatrix}, \quad \sigma(t) = \int_0^t \kappa(s)\, ds.$

Hint. Use (3.7.26)-(3.7.27) and (3.8.21)-(3.8.24).


7. Let $e_1, e_2, e_3$ denote the standard basis of $\mathbb{R}^3$, and let
$v_1 = (1 + \beta^2)^{-1/2}(\beta e_1 + e_3), \quad v_2 = e_2, \quad v_3 = (1 + \beta^2)^{-1/2}(e_1 - \beta e_3).$
Show that $v_1, v_2, v_3$ form an orthonormal basis of $\mathbb{R}^3$ and, with K as in (3.8.42),
$K v_1 = 0, \quad K v_2 = -(1 + \beta^2)^{1/2} v_3, \quad K v_3 = (1 + \beta^2)^{1/2} v_2.$

Deduce that
$e^{\sigma K} v_1 = v_1,$
$e^{\sigma K} v_2 = (\cos \eta) v_2 - (\sin \eta) v_3,$
$e^{\sigma K} v_3 = (\sin \eta) v_2 + (\cos \eta) v_3,$

where $\eta = (1 + \beta^2)^{1/2} \sigma$.


8. Given $B \in M(n, \mathbb{C})$, write down the solution to

$\frac{dx}{dt} = e^{tB} x, \quad x(0) = x_0.$
Hint. Use (3.8.24).


Exercises 9-10 deal with a linear equation with periodic coefficients:
(3.8.43)  $\frac{dx}{dt} = A(t)x, \quad A(t + 1) = A(t).$
Say $A(t) \in M(n, \mathbb{C})$.


9. Assume M(t) solves (3.8.6), with A(t) as in (3.8.43), and M(0) = I. Show that
(3.8.44)  $M(1) = C \Longrightarrow M(t + 1) = M(t)C.$


10. In the setting of Exercise 9, we know M(t) is invertible for all t, so C is
invertible. Results of Appendix 3.A yield $X \in M(n, \mathbb{C})$ such that

(3.8.45)  $e^X = C.$
Show that
(3.8.46)  $P(t) = M(t)e^{-tX} \Longrightarrow P(t + 1) = P(t).$

The representation
(3.8.47)  $M(t) = P(t)e^{tX}$
is called the Floquet representation of M(t).


3.9. Variation of parameters and Duhamel’s formula 


An inhomogeneous equation
(3.9.1)  $y'' + a(t)y' + b(t)y = f(t)$

can be solved via the method of variation of parameters, if one is given a complete
set $u_1(t), u_2(t)$ of solutions to the homogeneous equation

(3.9.2)  $u_j'' + a(t)u_j' + b(t)u_j = 0.$

The method (derived already in §1.12 when a(t) and b(t) are constant) consists of
seeking a solution to (3.9.1) in the form

(3.9.3)  $y(t) = v_1(t)u_1(t) + v_2(t)u_2(t),$

and finding equations for $v_j(t)$ which can be solved and which work to yield a
solution to (3.9.1). We have

(3.9.4)  $y' = v_1 u_1' + v_2 u_2' + v_1' u_1 + v_2' u_2.$

To proceed, we introduce a "trick": we impose the condition

(3.9.5)  $v_1' u_1 + v_2' u_2 = 0.$

Then $y'' = v_1' u_1' + v_2' u_2' + v_1 u_1'' + v_2 u_2''$, and plugging in (3.9.2) gives
(3.9.6)  $y'' = v_1' u_1' + v_2' u_2' - (a u_1' + b u_1)v_1 - (a u_2' + b u_2)v_2,$
hence

(3.9.7)  $y'' + ay' + by = v_1' u_1' + v_2' u_2'.$

Thus we have a solution to (3.9.1) in the form (3.9.3) provided $v_1'$ and $v_2'$ solve

(3.9.8)  $v_1' u_1 + v_2' u_2 = 0,$
         $v_1' u_1' + v_2' u_2' = f.$

This linear system for $v_1', v_2'$ has the explicit solution

(3.9.9)  $v_1' = -\frac{u_2}{W} f, \quad v_2' = \frac{u_1}{W} f,$

where W(t) is the Wronskian:

(3.9.10)  $W = u_1 u_2' - u_2 u_1' = \det \begin{pmatrix} u_1 & u_2 \\ u_1' & u_2' \end{pmatrix}.$
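Then
(3.9.11)  $v_1(t) = -\int_{t_0}^{t} \frac{u_2(s) f(s)}{W(s)}\, ds + C_1,$
          $v_2(t) = \int_{t_0}^{t} \frac{u_1(s) f(s)}{W(s)}\, ds + C_2.$
So
(3.9.12)  $y(t) = C_1 u_1(t) + C_2 u_2(t) + \int_{t_0}^{t} [u_2(t)u_1(s) - u_1(t)u_2(s)] \frac{f(s)}{W(s)}\, ds.$

Note that
$u_2(t)u_1(s) - u_1(t)u_2(s) = \det \begin{pmatrix} u_1(s) & u_2(s) \\ u_1(t) & u_2(t) \end{pmatrix}.$

We can connect the formula (3.9.12) with that produced in §3.8 as follows. If
y(t) solves (3.9.1), then $x(t) = (y(t), y'(t))^t$ solves the first order system

(3.9.13)  $\frac{dx}{dt} = A(t)x + \begin{pmatrix} 0 \\ f(t) \end{pmatrix},$
where
(3.9.14)  $A(t) = \begin{pmatrix} 0 & 1 \\ -b(t) & -a(t) \end{pmatrix},$
and a complete set of solutions to the homogeneous version of (3.9.13) is given by
(3.9.15)  $x_j(t) = \begin{pmatrix} u_j(t) \\ u_j'(t) \end{pmatrix}, \quad j = 1, 2.$
Thus we can set
(3.9.16)  $M(t) = \begin{pmatrix} u_1(t) & u_2(t) \\ u_1'(t) & u_2'(t) \end{pmatrix},$

and as in (3.8.19) we have

(3.9.17)  $\begin{pmatrix} y(t) \\ y'(t) \end{pmatrix} = M(t)M(t_0)^{-1} \begin{pmatrix} y_0 \\ y_1 \end{pmatrix} + M(t) \int_{t_0}^{t} M(s)^{-1} \begin{pmatrix} 0 \\ f(s) \end{pmatrix} ds,$

solving (3.9.13) with $y(t_0) = y_0$, $y'(t_0) = y_1$. Note that
(3.9.18)  $M(s)^{-1} = \frac{1}{W(s)} \begin{pmatrix} u_2'(s) & -u_2(s) \\ -u_1'(s) & u_1(s) \end{pmatrix},$

with W(s), the Wronskian, as in (3.9.10). Thus the last term on the right side of
(3.9.17) is equal to

(3.9.19)  $\int_{t_0}^{t} \begin{pmatrix} u_1(t) & u_2(t) \\ u_1'(t) & u_2'(t) \end{pmatrix} \begin{pmatrix} -u_2(s) \\ u_1(s) \end{pmatrix} \frac{f(s)}{W(s)}\, ds,$

and performing this matrix multiplication yields the integrand in (3.9.12). Thus
we see that Duhamel's formula provides an alternative approach to the method of
variation of parameters. This time, no "tricks" are involved.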
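
As an illustration of the integral term in (3.9.12), the following sketch evaluates it numerically for $y'' + y = t$, taking $u_1 = \cos t$, $u_2 = \sin t$ (so $W \equiv 1$) and $t_0 = 0$. The composite Simpson rule, the function names, and the sample point t = 2 are choices made here for illustration; the closed form $t - \sin t$ used for comparison comes from evaluating $\int_0^t \sin(t - s)\, s\, ds$ by hand.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a)/n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2)*g(a + i*h)
    return total*h/3

def particular_solution(u1, u2, W, f, t0, t):
    """Integral term of (3.9.12): the solution with y(t0) = y'(t0) = 0."""
    def integrand(s):
        return (u2(t)*u1(s) - u1(t)*u2(s))*f(s)/W(s)
    return simpson(integrand, t0, t)

# y'' + y = t with u1 = cos, u2 = sin, W = 1 (illustrative data).
y_p = particular_solution(math.cos, math.sin,
                          lambda s: 1.0, lambda s: s, 0.0, 2.0)
print(y_p, 2.0 - math.sin(2.0))
```

The two printed numbers agree to many digits, as (3.9.12) predicts.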

Exercises


1. Use the method of variation of parameters to solve

(3.9.20)  $y'' + y = \tan t.$


2. Convert (3.9.20) to a $2 \times 2$ first order system and use Duhamel's formula to
solve it. Compare the result with your work on Exercise 1. Compare also with
(3.4.6)-(3.4.9).


3. Do analogues of Exercises 1-2 for each of the following equations.

(a) $y'' + y = e^t$,
(b) $y'' + y = \sin t$,
(c) $y'' + y = t$,
(d) $y'' + y = t^2$.


4. Show that the Wronskian, defined by (3.9.10), satisfies the equation

(3.9.21)  $\frac{dW}{dt} = -a(t)W,$

if $u_1$ and $u_2$ solve (3.9.2). Relate this to (3.8.9).


5. Show that one solution to

(3.9.22)  $u'' + 2tu' + 2u = 0$
is

(3.9.23)  $u_1(t) = e^{-t^2}.$

Set up and solve the differential equation for $W(t) = u_1 u_2' - u_2 u_1'$. Then solve the
associated first order equation for $u_2$, to produce a linearly independent solution
$u_2$ to (3.9.22), in terms of an integral.


6. Do Exercise 5 with (3.9.22) replaced by
$u'' + 2u' + u = 0,$
one of whose solutions is
$u_1(t) = e^{-t}.$

3.10. Power series expansions 


Here we produce solutions to initial value problems

(3.10.1)  $\frac{dx}{dt} = A(t)x + f(t), \quad x(0) = x_0,$
in terms of a power series expansion
(3.10.2)  $x(t) = x_0 + x_1 t + x_2 t^2 + \cdots = \sum_{k=0}^{\infty} x_k t^k,$

under the hypothesis that the $n \times n$ matrix-valued function A(t) and vector-valued
function f(t) are given by power series,
(3.10.3)  $A(t) = \sum_{k=0}^{\infty} A_k t^k, \quad f(t) = \sum_{k=0}^{\infty} f_k t^k,$
convergent for $|t| < R_0$. The coefficients $x_k$ in (3.10.2) will be obtained recursively,
as follows. Given x(t) of the form (3.10.2), we have

(3.10.4)  $\frac{dx}{dt} = \sum_{k=1}^{\infty} k x_k t^{k-1} = \sum_{k=0}^{\infty} (k+1) x_{k+1} t^k,$
and
(3.10.5)  $A(t)x = \sum_{j=0}^{\infty} A_j t^j \sum_{\ell=0}^{\infty} x_\ell t^\ell = \sum_{k=0}^{\infty} \Bigl(\sum_{j=0}^{k} A_{k-j} x_j\Bigr) t^k,$

so the power series on the left and right sides of (3.10.1) agree if and only if, for
each $k \ge 0$,

(3.10.6)  $(k+1) x_{k+1} = \sum_{j=0}^{k} A_{k-j} x_j + f_k.$
In particular, the first three recursions are
(3.10.7)  $x_1 = A_0 x_0 + f_0,$
          $2x_2 = A_1 x_0 + A_0 x_1 + f_1,$
          $3x_3 = A_2 x_0 + A_1 x_1 + A_0 x_2 + f_2.$

To start the recursion, the initial condition in (3.10.1) specifies $x_0$.
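
The recursion (3.10.6) is straightforward to carry out by machine. The following sketch, with matrices stored as nested lists, computes the coefficients $x_0, \dots, x_m$ from given lists of coefficients $A_k$ and $f_k$ (missing coefficients are treated as zero). The function names and the two test systems are illustrative assumptions: a constant coefficient system whose solution is $(\cos t, -\sin t)$, and the Airy system with $A_0$, $A_1$ as in (3.10.48).

```python
import math

def series_coefficients(A_coeffs, f_coeffs, x0, order):
    """Return [x_0, ..., x_order] from (k+1) x_{k+1} = sum_j A_{k-j} x_j + f_k."""
    n = len(x0)
    xs = [list(x0)]
    for k in range(order):
        # Start from f_k, or the zero vector if f has no such coefficient.
        rhs = list(f_coeffs[k]) if k < len(f_coeffs) else [0.0]*n
        for j in range(k + 1):
            if k - j < len(A_coeffs):
                Akj = A_coeffs[k - j]
                for r in range(n):
                    rhs[r] += sum(Akj[r][c]*xs[j][c] for c in range(n))
        xs.append([v/(k + 1) for v in rhs])
    return xs

def evaluate(xs, t):
    """Partial sum of the power series at t."""
    n = len(xs[0])
    return [sum(x[r]*t**k for k, x in enumerate(xs)) for r in range(n)]

# x' = A0 x with A0 = [[0, 1], [-1, 0]], x(0) = (1, 0): x(t) = (cos t, -sin t).
rotation = series_coefficients([[[0.0, 1.0], [-1.0, 0.0]]], [], [1.0, 0.0], 25)
approx = evaluate(rotation, 1.0)

# Airy system: A(t) = A0 + A1 t, with y(0) = 1, y'(0) = 0.
A0 = [[0.0, 1.0], [0.0, 0.0]]
A1 = [[0.0, 0.0], [1.0, 0.0]]
airy = series_coefficients([A0, A1], [], [1.0, 0.0], 6)
print(approx, airy[3])
```

For the Airy system the first nonzero coefficient after $x_0$ in the first component is the $t^3$ term, equal to 1/6.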
We next address the issue of convergence of the power series thus produced for 
x(t). We will establish the following. 


Proposition 3.10.1. Under the hypotheses given above, the power series (3.10.2)
converges to the solution x(t) to (3.10.1), for $|t| < R_0$.


Proof. The hypotheses on (3.10.3) imply that for each $R < R_0$, there exist
$a, b \in (0, \infty)$ such that

(3.10.8)  $\|A_k\| \le aR^{-k}, \quad \|f_k\| \le bR^{-k}, \quad \forall\, k \in \mathbb{Z}^+.$
We will show that, given $r \in (0, R)$, there exists $C \in (0, \infty)$ such that
(3.10.9)  $\|x_j\| \le Cr^{-j}, \quad \forall\, j \in \mathbb{Z}^+.$

Such estimates imply that the power series (3.10.2) converges for $|t| < r$, for each
$r < R_0$, hence for $|t| < R_0$.

We will prove (3.10.9) by induction. The inductive step is to assume it holds
for all $j \le k$ and to deduce that it holds for $j = k + 1$. This deduction proceeds as
follows. We have, by (3.10.6), (3.10.8), and (3.10.9) for $j \le k$,

(3.10.10)  $(k+1)\|x_{k+1}\| \le \sum_{j=0}^{k} \|A_{k-j}\| \cdot \|x_j\| + \|f_k\|$
           $\le aC \sum_{j=0}^{k} R^{j-k} r^{-j} + bR^{-k}$
           $= aCr^{-k} \sum_{j=0}^{k} \Bigl(\frac{r}{R}\Bigr)^{k-j} + bR^{-k}.$
Now, given $0 < r < R$,
(3.10.11)  $\sum_{j=0}^{k} \Bigl(\frac{r}{R}\Bigr)^{k-j} < \sum_{j=0}^{\infty} \Bigl(\frac{r}{R}\Bigr)^{j} = \frac{1}{1 - r/R} = M(R, r) < \infty.$
Hence
(3.10.12)  $(k+1)\|x_{k+1}\| \le aCM(R, r)r^{-k} + br^{-k}.$
We place on C the constraint that
(3.10.13)  $C \ge b,$
and obtain
(3.10.14)  $\|x_{k+1}\| \le \frac{aM(R, r) + 1}{k+1} \cdot Cr^{-k}.$
This gives the desired result,
(3.10.15)  $\|x_{k+1}\| \le Cr^{-k-1},$
as long as
(3.10.16)  $\frac{aM(R, r) + 1}{k+1} \le \frac{1}{r}.$
Thus, to finish the argument, we pick $K \in \mathbb{N}$ such that
(3.10.17)  $K + 1 \ge [aM(R, r) + 1]r.$
(Recall that we have a, R, r, and M(R, r).) Then we pick $C \in (0, \infty)$ large enough
that (3.10.9) holds for all $j \in \{0, 1, \dots, K\}$, i.e., we take (in addition to (3.10.13))
(3.10.18)  $C \ge \max_{0 \le j \le K} r^j \|x_j\|.$

Then for all $k \ge K$, the inductive step yielding (3.10.15) from the validity of (3.10.9)
for all $j \le k$ holds, and the inductive proof of (3.10.9) is complete.


For notational simplicity, we have discussed power series expansions about t = 0
so far, but the same considerations apply to power series about a more general point
$t_0$. Thus we could replace (3.10.1) by

(3.10.19)  $\frac{dx}{dt} = A(t)x + f(t), \quad x(t_0) = x_0,$
with A(t) and f(t) given by power series
(3.10.20)  $A(t) = \sum_{k=0}^{\infty} A_k (t - t_0)^k, \quad f(t) = \sum_{k=0}^{\infty} f_k (t - t_0)^k,$
for $|t - t_0| < R_0$, and find x(t) in the form
(3.10.21)  $x(t) = \sum_{k=0}^{\infty} x_k (t - t_0)^k.$

The recursive formula for the coefficients $x_k$ is again given by (3.10.6), and
(3.10.8)-(3.10.18) apply without further change.


It is worth noting that, in (3.10.20)-(3.10.21),
(3.10.22)  $A_k = \frac{1}{k!} A^{(k)}(t_0), \quad f_k = \frac{1}{k!} f^{(k)}(t_0), \quad x_k = \frac{1}{k!} x^{(k)}(t_0).$

These formulas, say for $f_k$, arise as follows. Setting $t = t_0$ in (3.10.20) gives
$f_0 = f(t_0)$. Generally, if the power series for f(t) converges for $|t - t_0| < R_0$, so
does the power series

(3.10.23)  $f'(t) = \sum_{k=1}^{\infty} k f_k (t - t_0)^{k-1},$
and more generally,
(3.10.24)  $f^{(n)}(t) = \sum_{k=n}^{\infty} k(k-1) \cdots (k-n+1) f_k (t - t_0)^{k-n},$

and setting $t = t_0$ in (3.10.24) gives $f^{(n)}(t_0) = n!\, f_n$.
As an aside, we mention that a convenient way to prove (3.10.23) is to define
g(t) to be the power series on the right side of (3.10.23) and show that

(3.10.25)  $\int_{t_0}^{t} g(s)\, ds = \sum_{k=1}^{\infty} f_k (t - t_0)^k = f(t) - f(t_0).$

Compare the proof of Proposition 1.C.4.
We next establish the following important fact about functions given by
convergent power series.


Proposition 3.10.2. If f(t) is given by a power series as in (3.10.20), convergent
for t satisfying $|t - t_0| < R_0$, then f can also be expanded in a power series in $t - t_1$,
for each point $t_1 \in (t_0 - R_0, t_0 + R_0)$, with radius of convergence $R_0 - |t_0 - t_1|$.


Proof. For notational simplicity, we take $t_0 = 0$. Thus we assume $|t_1| < R_0$. For
$|s| < R_0 - |t_1|$, we have

(3.10.26)  $f(t_1 + s) = \sum_{k=0}^{\infty} f_k (t_1 + s)^k = \sum_{k=0}^{\infty} \sum_{j=0}^{k} f_k \binom{k}{j} s^j t_1^{k-j},$

the second identity by the binomial formula. Call the last double series $\sum_k \sum_j \alpha_{jk}$.
Note that

(3.10.27)  $\sum_{k=0}^{\infty} \sum_{j=0}^{k} \|\alpha_{jk}\| = \sum_{k=0}^{\infty} \sum_{j=0}^{k} \|f_k\| \binom{k}{j} |s|^j |t_1|^{k-j}$
           $= \sum_{k=0}^{\infty} \|f_k\| (|s| + |t_1|)^k$
           $< \infty,$

given $|s| + |t_1| < R_0$. In such a case, we can reverse the order of summation, and
write

(3.10.28)  $\sum_{k=0}^{\infty} \sum_{j=0}^{k} \alpha_{jk} = \sum_{j=0}^{\infty} \Bigl(\sum_{k=j}^{\infty} \alpha_{jk}\Bigr),$

all series being absolutely convergent. Hence

(3.10.29)  $f(t_1 + s) = \sum_{j=0}^{\infty} \Bigl(\sum_{k=j}^{\infty} f_k \binom{k}{j} t_1^{k-j}\Bigr) s^j$

is an absolutely convergent series, as long as $|s| < R_0 - |t_1|$.


A (vector-valued) function f defined on an interval I = (a, b) is said to be an
analytic function on I if and only if for each $t_1 \in I$, there is an $r_1 > 0$ such that
for $|t - t_1| < r_1$, f(t) is given by a convergent power series in $t - t_1$. Parallel to
(3.10.22), such a power series is necessarily of the form

(3.10.30)  $f(t) = \sum_{n=0}^{\infty} \frac{f^{(n)}(t_1)}{n!} (t - t_1)^n.$

It follows from Proposition 3.10.2 that if f(t) is given by a convergent power series
in $t - t_0$ for $|t - t_0| < R_0$, then f is analytic in the interval $(t_0 - R_0, t_0 + R_0)$.


The following is a useful fact about analytic functions.


Lemma 3.10.3. If f is analytic on (a, b) and $a < \alpha < \beta < b$, then there exists
$\delta > 0$ such that, for each $t_1 \in [\alpha, \beta]$, the power series (3.10.30) converges whenever
$|t - t_1| < \delta$.


Proof. By hypothesis, each $p \in [\alpha, \beta]$ is the center of an open interval $I_p$ on which f
is given by a convergent power series about p. Let $(1/2)I_p$ denote the open interval
centered at p whose length is half that of $I_p$. Since $[\alpha, \beta]$ is a closed, bounded
interval, results of Appendix 4.B (see Proposition 4.B.9) imply that there exists a
finite set $\{p_1, \dots, p_K\} \subset [\alpha, \beta]$ such that the intervals $(1/2)I_{p_j}$ cover $[\alpha, \beta]$. Take

(3.10.31)  $\delta = \frac{1}{4} \min_{1 \le j \le K} \ell(I_{p_j}),$

where $\ell(I_{p_j})$ denotes the length of $I_{p_j}$.
Then each $t_1 \in [\alpha, \beta]$ is contained in some $(1/2)I_{p_j}$, and hence $|t_1 - p_j| < (1/4)\ell(I_{p_j})$,
so

(3.10.32)  $(t_1 - \delta, t_1 + \delta) \subset I_{p_j}.$
Hence the convergence of the power series for f about $t_1$ on $(t_1 - \delta, t_1 + \delta)$ follows
from Proposition 3.10.2.


With these tools in hand, we have the following result. 


Proposition 3.10.4. Assume A(t) and f(t) are analytic on an interval (a, b) and
$t_0 \in (a, b)$. Then the initial value problem (3.10.19) has a unique solution x(t),
analytic on (a, b).


Proof. Let $\tilde{I} \subset (a, b)$ be the maximal interval on which the solution x(t) exists
and is analytic, say $\tilde{I} = (\alpha, \beta)$. If $\beta < b$, take $\delta$ as in Lemma 3.10.3 for $[t_0, \beta]$. Take
$t_1 = \beta - \delta/2$. (If necessary, shrink $\delta$ to arrange that $t_1 > t_0$.) Then A(t) and f(t)
have convergent power series about $t_1$, with radius of convergence $\ge \delta$, so
(3.10.33)  x(t) extends analytically to $[t_1, t_1 + \delta)$, and $t_1 + \delta > \beta$.

This contradiction proves that $\beta = b$, and a similar argument gives $\alpha = a$, so in
fact $\tilde{I} = (a, b)$.


To conclude this section, we mention a connection with the study of functions
of a complex variable, which the reader could pursue further, consulting texts on
complex analysis, such as [4], or Chapter 7 of [47]. Here is the general setup. Let
$\Omega \subset \mathbb{C}$ be an open set, and $f : \Omega \to \mathbb{C}$. We say f is complex differentiable at $z_0 \in \Omega$
provided
(3.10.34)  $\lim_{w \to 0} \frac{f(z_0 + w) - f(z_0)}{w}$
exists. Here, $w \to 0$ in $\mathbb{C}$. If this limit exists, we call it

(3.10.35)  $f'(z_0).$

We say f is complex differentiable on $\Omega$ if it is complex differentiable at each $z_0 \in \Omega$.


The relevance of this concept to the material of this section is the following.
If f(t) is given by the power series (3.10.20), absolutely convergent for real
$t \in (t_0 - R_0, t_0 + R_0)$, then

(3.10.36)  $f(z) = \sum_{k=0}^{\infty} f_k (z - t_0)^k$

is absolutely convergent for $z \in \mathbb{C}$ satisfying $|z - t_0| < R_0$, i.e., on the disk
(3.10.37)  $D_{R_0}(t_0) = \{z \in \mathbb{C} : |z - t_0| < R_0\},$

and it is complex differentiable on this disk. Furthermore, $f'$ is complex
differentiable on this disk, etc., including the kth order derivative $f^{(k)}$, and

(3.10.38)  $f_k = \frac{f^{(k)}(t_0)}{k!}.$

More is true, namely the following converse.


Theorem 3.10.5. Assume f is complex differentiable on the open set $\Omega \subset \mathbb{C}$. Let
$t_0 \in \Omega$ and assume $D_{R_0}(t_0) \subset \Omega$. Then f is given by a power series, of the form
(3.10.36), absolutely convergent on $D_{R_0}(t_0)$.

This is one of the central basic results of complex analysis. A proof can be
found in Chapter 5 of [4], and in Chapter 2 of [47]. In view of Theorem 3.10.5,
complex differentiable functions are also called complex analytic.


EXAMPLE. Consider

(3.10.39)  $f(z) = \frac{1}{z^2 + 1}.$

This is well defined except at $\pm i$, where the denominator vanishes, and one can
readily verify that f is complex differentiable on $\mathbb{C} \setminus \{i, -i\}$. It follows from Theorem
3.10.5 that if $t_0 \in \mathbb{C} \setminus \{i, -i\}$, this function is given by a power series expansion
about $t_0$, absolutely convergent on $D_R(t_0)$, where

(3.10.40)  $R = \min\{|t_0 - i|, |t_0 + i|\}.$

In particular,
(3.10.41)  $t_0 \in \mathbb{R} \Longrightarrow R = \sqrt{t_0^2 + 1}$

gives the radius of convergence of the power series expansion of $1/(z^2 + 1)$ about
$t_0$. This is easy to see directly for $t_0 = 0$:

(3.10.42)  $\frac{1}{z^2 + 1} = \sum_{k=0}^{\infty} (-1)^k z^{2k}.$

However, for other $t_0 \in \mathbb{R}$, it is not so easy to see directly that this function has a
power series expansion about $t_0$ with radius of convergence given by (3.10.41). The
reader might give this a try.
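
A numerical experiment can at least make (3.10.41) plausible. Writing $w = z - t_0$, we have $z^2 + 1 = (t_0^2 + 1) + 2t_0 w + w^2$, so the coefficients $f_k$ of $1/(z^2+1)$ in powers of $w$ satisfy $(t_0^2 + 1) f_k + 2t_0 f_{k-1} + f_{k-2} = 0$ for $k \ge 1$ (with $f_{-1} = 0$) and $f_0 = 1/(t_0^2 + 1)$. The sketch below computes these coefficients and estimates the radius of convergence by the root test. The function names, the truncation order, and the window used to approximate the limsup are assumptions made for illustration, and the output is only an estimate, not a proof.

```python
import math

def taylor_coeffs_inverse_quadratic(t0, order):
    """Coefficients f_0..f_order of 1/(z^2+1) in powers of w = z - t0 (t0 real)."""
    c0, c1 = t0*t0 + 1.0, 2.0*t0   # z^2 + 1 = c0 + c1 w + w^2
    f = [1.0/c0]
    for k in range(1, order + 1):
        prev2 = f[k - 2] if k >= 2 else 0.0
        f.append(-(c1*f[k - 1] + prev2)/c0)
    return f

def radius_estimate(f, window=8):
    """Root test: 1/R = limsup |f_k|^(1/k), approximated over the last few k."""
    ks = range(len(f) - window, len(f))
    return 1.0/max(abs(f[k])**(1.0/k) for k in ks)

estimate_at_1 = radius_estimate(taylor_coeffs_inverse_quadratic(1.0, 200))
estimate_at_0 = radius_estimate(taylor_coeffs_inverse_quadratic(0.0, 200))
print(estimate_at_1, math.sqrt(2.0))
print(estimate_at_0, 1.0)
```

The maximum over a window of consecutive indices is used because individual coefficients can vanish (for instance, every odd coefficient when $t_0 = 0$).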


To interface this example with Proposition 3.10.4, we note that, by this propo- 
sition, plus the results just derived on 1/(z? + 1), the equation 


(3.10.43) o = ¢ e os) a+ (oy «(0) = (4) 


has a solution that is analytic on $(-\infty, \infty)$. The power series expansion for the
solution x(t) about $t_0$ converges for $|t| < 1$ if $t_0 = 0$ (this is an easy consequence of
Proposition 3.10.4 and (3.10.42)), and for other $t_0 \in \mathbb{R}$, it converges for
$|t - t_0| < \sqrt{t_0^2 + 1}$ (as a consequence of Proposition 3.10.4 and Theorem 3.10.5).


See Appendix 3.C for a further discussion of complex analytic functions. 

Exercises


1. Consider the function $g : \mathbb{C} \setminus \{0\} \to \mathbb{C}$ given by
(3.10.44)  $g(z) = e^{-1/z^2}.$
Show that g is complex differentiable on $\mathbb{C} \setminus \{0\}$. Use Theorem 3.10.5 to deduce
that $h : \mathbb{R} \to \mathbb{R}$, given by

(3.10.45)  $h(t) = e^{-1/t^2}, \quad t \ne 0,$
           $h(t) = 0, \quad t = 0,$

is analytic on $\mathbb{R} \setminus \{0\}$. Show that h is not analytic on any interval containing 0.
Compute $h^{(k)}(0)$.


2. Consider the Airy equation,
(3.10.46)  $y'' = ty, \quad y(0) = y_0, \quad y'(0) = y_1,$
introduced in (1.15.9). Show that this yields the first order system
(3.10.47)  $\frac{dx}{dt} = (A_0 + A_1 t)x, \quad x(0) = x_0,$
with
(3.10.48)  $A_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad A_1 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad x_0 = \begin{pmatrix} y_0 \\ y_1 \end{pmatrix}.$
Note that
(3.10.49)  $A_0^2 = A_1^2 = 0,$
and
(3.10.50)  $A_0 A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad A_1 A_0 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$


3. For a system of the form (3.10.47), whose solution has a power series of the form
(3.10.2), the recursion (3.10.6) becomes
(3.10.51)  $(k+1) x_{k+1} = A_0 x_k + A_1 x_{k-1},$
with the convention that $x_{-1} = 0$. Assume (3.10.49) holds. Show that
(3.10.52)  $x_{k+3} = \frac{1}{k+3} \Bigl(\frac{1}{k+2} A_0 A_1 + \frac{1}{k+1} A_1 A_0\Bigr) x_k.$

Note that when $A_0$ and $A_1$ are given by (3.10.48), this becomes

(3.10.53)  $x_{k+3} = \frac{1}{k+3} \begin{pmatrix} \frac{1}{k+2} & 0 \\ 0 & \frac{1}{k+1} \end{pmatrix} x_k.$
Establish directly from (3.10.52) that the series $\sum x_k t^k$ is absolutely convergent for
all t. Hint. Separately tackle the three series

(3.10.54)  $\sum_{\ell=0}^{\infty} x_{3\ell+j}\, t^{3\ell}, \quad j = 0, 1, 2.$
Use (3.10.52) and the ratio test to show that each one converges for all t.


3.11. Regular singular points 


Here we consider equations of the form

(3.11.1)  $t \frac{dx}{dt} = A(t)x,$

where x takes values in $\mathbb{C}^n$, and A(t), with values in $M(n, \mathbb{C})$, has a power series
convergent for t in some interval $(-T_0, T_0)$,

(3.11.2)  $A(t) = A_0 + A_1 t + A_2 t^2 + \cdots.$

The system (3.11.1) is said to have a regular singular point at t = 0. One source of
such systems is the following class of second order equations:

(3.11.3)  $t^2 u''(t) + t b(t) u'(t) + c(t) u(t) = 0,$

where b(t) and c(t) have convergent power series for $|t| < T_0$. In such a case, one
can set

(3.11.4)  $x(t) = \begin{pmatrix} u(t) \\ t u'(t) \end{pmatrix},$

obtaining (3.11.1) with

(3.11.5)  $A(t) = \begin{pmatrix} 0 & 1 \\ -c(t) & 1 - b(t) \end{pmatrix}.$

A paradigm example, studied in Section 1.16, is the Bessel equation

(3.11.6)  $\frac{d^2 u}{dt^2} + \frac{1}{t} \frac{du}{dt} + \Bigl(1 - \frac{\nu^2}{t^2}\Bigr) u = 0,$
which via (3.11.4) takes the form (3.11.1), with
(3.11.7)  $A(t) = A_0 + A_2 t^2, \quad A_0 = \begin{pmatrix} 0 & 1 \\ \nu^2 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}.$

It follows from Proposition 3.10.4 that, given $t_0 \in (0, T_0)$, the equation (3.11.1),
with initial condition $x(t_0) = v_0$, has a unique solution analytic on $(0, T_0)$. Our
goal here is to analyze the behavior of x(t) as $t \searrow 0$.


A starting point for the analysis of (3.11.1) is the case $A(t) \equiv A_0$, i.e.,

(3.11.8)  $t \frac{dx}{dt} = A_0 x.$
The change of variable $z(s) = x(e^s)$ yields

(3.11.9)  $\frac{dz}{ds} = A_0 z(s),$

with solution

(3.11.10)  $z(s) = e^{sA_0} v, \quad v = z(0),$
hence

(3.11.11)  $x(t) = e^{(\log t)A_0} v = t^{A_0} v, \quad t > 0,$

the latter identity defining $t^{A_0}$, for t > 0. Compare results on the Euler equations
in Exercises 1-3, §1.15.


Note that if $v \in E(A_0, \lambda)$, then $t^{A_0} v = t^{\lambda} v$, which either blows up or vanishes
as $t \searrow 0$, if $\mathrm{Re}\, \lambda < 0$ or $\mathrm{Re}\, \lambda > 0$, respectively, or oscillates rapidly as $t \searrow 0$, if $\lambda$ is
purely imaginary but not zero. On the other hand,

(3.11.12)  $v \in N(A_0) \Longrightarrow t^{A_0} v \equiv v.$

It is useful to have the following extension of this result to the setting of (3.11.1).


Lemma 3.11.1. If $v \in N(A_0)$, then (3.11.1) has a solution given by a convergent
power series on some interval about the origin,

(3.11.13)  $x(t) = x_0 + x_1 t + x_2 t^2 + \cdots, \quad x_0 = v,$

as long as the eigenvalues of $A_0$ satisfy a mild condition, given in (3.11.18) below.


Proof. We produce a recursive formula for the coefficients $x_k$ in (3.11.13), in the
spirit of the calculations of §3.10. We have

(3.11.14)  $t \frac{dx}{dt} = \sum_{k \ge 1} k x_k t^k,$

and

(3.11.15)  $A(t)x = \sum_{j \ge 0} A_j t^j \sum_{\ell \ge 0} x_\ell t^\ell = A_0 x_0 + \sum_{k \ge 1} \sum_{\ell=0}^{k} A_{k-\ell} x_\ell t^k.$

Equating the power series in (3.11.14) and (3.11.15) would be impossible without
our hypothesis that $A_0 x_0 = 0$, but having that, we obtain the recursive formulas,
for $k \ge 1$,

(3.11.16)  $k x_k = A_0 x_k + \sum_{\ell=0}^{k-1} A_{k-\ell} x_\ell,$

i.e.,

(3.11.17)  $(kI - A_0) x_k = \sum_{\ell=0}^{k-1} A_{k-\ell} x_\ell.$

Clearly we can solve uniquely for $x_k$ provided

(3.11.18)  $\forall\, k \in \mathbb{N} = \{1, 2, 3, \dots\}, \quad k \notin \operatorname{Spec} A_0.$

This is the condition on $\operatorname{Spec} A_0$ mentioned in the lemma. As long as this holds,
we can solve for the coefficients $x_k$ for all $k \in \mathbb{N}$, obtaining (3.11.13). Estimates on
these coefficients implying that (3.11.13) has a positive radius of convergence are
quite similar to those made in §3.10, and will not be repeated here.
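
The recursion (3.11.17) is easy to run by machine. The sketch below is a minimal
illustration for 2 x 2 systems in exact rational arithmetic; the function names and
the dictionary representation of the coefficients $A_j$ are assumptions of this
example. It is applied to the Bessel system (3.11.7) with $\nu = 0$ and $v = (1, 0)^t$, which
lies in $\mathcal{N}(A_0)$.

```python
from fractions import Fraction as Fr

def solve2(m, rhs):
    """Solve the 2x2 linear system m x = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det
    return [x0, x1]

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def series_coefficients(A, v, kmax):
    """Coefficients x_0..x_kmax from (kI - A_0) x_k = sum_{l<k} A_{k-l} x_l.

    A maps an index j to the matrix A_j; a missing index means A_j = 0.
    Assumes A_0 v = 0 and that no k in 1..kmax is an eigenvalue of A_0.
    """
    A0 = A[0]
    xs = [list(v)]
    for k in range(1, kmax + 1):
        rhs = [Fr(0), Fr(0)]
        for l in range(k):
            if (k - l) in A:
                w = apply(A[k - l], xs[l])
                rhs = [rhs[0] + w[0], rhs[1] + w[1]]
        m = [[k - A0[0][0], -A0[0][1]], [-A0[1][0], k - A0[1][1]]]
        xs.append(solve2(m, rhs))
    return xs

# Bessel system (3.11.7) with nu = 0: A(t) = A_0 + A_2 t^2.
A = {0: [[Fr(0), Fr(1)], [Fr(0), Fr(0)]],
     2: [[Fr(0), Fr(0)], [Fr(-1), Fr(0)]]}
xs = series_coefficients(A, [Fr(1), Fr(0)], 6)
```

The first components of the resulting $x_k$ are 1, 0, -1/4, 0, 1/64, 0, -1/2304, the
familiar Taylor coefficients of the Bessel function $J_0(t)$.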
First general results


Our next goal is to extend the analysis arising in the proof of Lemma 3.11.1
to solutions to (3.11.1), for general A(t) of the form (3.11.2), without such a
hypothesis of membership in $\mathcal{N}(A_0)$ as made in the lemma. We will seek a
matrix-valued power series

(3.11.19)  $U(t) = I + U_1 t + U_2 t^2 + \cdots$

such that under the change of variable

(3.11.20)  $x(t) = U(t) y(t),$

(3.11.1) becomes

(3.11.21)  $t \frac{dy}{dt} = A_0 y.$

This will work as long as $A_0$ does not have two eigenvalues that differ by a nonzero
integer, in which case a more elaborate construction will be needed.


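To implement (3.11.20) and achieve (3.11.21), we have from (3.11.20) and
(3.11.1) that

(3.11.22)  $A(t)U(t)y = t \frac{dx}{dt} = tU(t) \frac{dy}{dt} + tU'(t)y,$

which gives (3.11.21) provided U(t) satisfies

(3.11.23)  $t \frac{dU}{dt} = A(t)U(t) - U(t)A_0.$

This equation has the same form as (3.11.1), namely

(3.11.24)  $t \frac{dU}{dt} = \mathcal{A}(t)U(t),$

where U takes values in $M(n, \mathbb{C})$ and $\mathcal{A}(t)$ takes values in $\mathcal{L}(M(n, \mathbb{C}))$;

(3.11.25)  $\mathcal{A}(t)U = A(t)U(t) - U(t)A_0 = (\mathcal{A}_0 + \mathcal{A}_1 t + \mathcal{A}_2 t^2 + \cdots)U.$

In particular,

(3.11.26)  $\mathcal{A}_0 U = A_0 U - U A_0 = [A_0, U] = C_{A_0} U,$

the last identity defining $C_{A_0} \in \mathcal{L}(M(n, \mathbb{C}))$. Note that

(3.11.27)  $U(0) = I \in \mathcal{N}(C_{A_0}),$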


so Lemma 3.11.1 applies to (3.11.24), i.e., to (3.11.23). In this setting, the recursion
for $U_k$, $k \ge 1$, analogous to (3.11.16)–(3.11.17), takes the form

(3.11.28)  $k U_k = [A_0, U_k] + \sum_{j=0}^{k-1} A_{k-j} U_j,$

i.e.,

(3.11.29)  $(kI - C_{A_0}) U_k = \sum_{j=0}^{k-1} A_{k-j} U_j.$

Recall $U_0 = I$. The condition for solvability of (3.11.29) for all $k \in \mathbb{N} = \{1, 2, 3, \dots\}$
is that no positive integer belong to $\operatorname{Spec} C_{A_0}$. Results of Chapter 2, §2.7 (cf. Exercise 9)
yield the following:

(3.11.30)  $\operatorname{Spec} A_0 = \{\lambda_j\} \Longrightarrow \operatorname{Spec} C_{A_0} = \{\lambda_j - \lambda_k\}.$

Thus the condition that $\operatorname{Spec} C_{A_0}$ contain no positive integer is equivalent to the
condition that $A_0$ have no two eigenvalues that differ by a nonzero integer. We
thus have the following result.


Proposition 3.11.2. Assume $A_0$ has no two eigenvalues that differ by a nonzero
integer. Then there exist $T_0 > 0$ and U(t) as in (3.11.19) with power series
convergent for $|t| < T_0$, such that the general solution to (3.11.1) for $t \in (0, T_0)$ has
the form

(3.11.31)  $x(t) = U(t) t^{A_0} v, \quad v \in \mathbb{C}^n.$
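
The hypothesis of Proposition 3.11.2 can be tested directly from a list of
eigenvalues, using (3.11.30). The following sketch is only an illustration: the
function name and the floating point tolerance are assumptions of this example,
and it expects the eigenvalues of $A_0$ to be supplied by the caller.

```python
def positive_integer_differences(eigs, tol=1e-9):
    """Positive integers among the differences lam_j - lam_k.

    By (3.11.30) these are the positive integers in Spec C_{A_0}; an empty
    result means that the hypothesis of Proposition 3.11.2 holds.
    """
    found = set()
    for lj in eigs:
        for lk in eigs:
            d = complex(lj) - complex(lk)
            n = round(d.real)
            if n >= 1 and abs(d - n) < tol:
                found.add(n)
    return sorted(found)

generic = positive_integer_differences([0.3, -0.3])   # nu = 0.3
half = positive_integer_differences([0.5, -0.5])      # nu = 1/2
integer = positive_integer_differences([2, -2])       # nu = 2
imaginary = positive_integer_differences([1j, -1j])   # purely imaginary pair
```

For the spectrum $\{\nu, -\nu\}$ the list is empty unless $2\nu$ is a nonzero integer, in which
case it contains the single entry $2|\nu|$.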


Bessel equation connection 


Let us see how Proposition 3.11.2 applies to the Bessel equation (3.11.6), which
we have recast in the form (3.11.1) with $A(t) = A_0 + A_2 t^2$, as in (3.11.7). Note
that

(3.11.32)  $A_0 = \begin{pmatrix} 0 & 1 \\ \nu^2 & 0 \end{pmatrix} \Longrightarrow \operatorname{Spec} A_0 = \{\nu, -\nu\}.$

Thus $\operatorname{Spec} C_{A_0} = \{2\nu, 0, -2\nu\}$, and Proposition 3.11.2 applies whenever $\nu$ is not an
integer or half-integer. Now as shown in §1.16, there is not an obstruction to series
expansions consistent with (3.11.31) when $\nu$ is a half-integer. This is due to the
special structure of (3.11.7), and suggests a more general result, of the following
sort. Suppose only even powers of t appear in the series for A(t):

(3.11.33)  $A(t) = A_0 + A_2 t^2 + A_4 t^4 + \cdots.$

Then we look for U(t), solving (3.11.23), in the form

(3.11.34)  $U(t) = I + U_2 t^2 + U_4 t^4 + \cdots.$

In such a case, only even powers of t occur in the power series for (3.11.23), and in
place of (3.11.28)–(3.11.29), one gets the following recursion formulas for $U_{2k}$, $k \ge 1$:

(3.11.35)  $2k U_{2k} = [A_0, U_{2k}] + \sum_{j=0}^{k-1} A_{2k-2j} U_{2j},$

i.e.,

(3.11.36)  $(2kI - C_{A_0}) U_{2k} = \sum_{j=0}^{k-1} A_{2k-2j} U_{2j}.$

This is solvable for $U_{2k}$ as long as $2k \notin \operatorname{Spec} C_{A_0}$, and we have the following.

Proposition 3.11.3. Assume A(t) satisfies (3.11.33), and $A_0$ has no two eigenvalues
that differ by a nonzero even integer. Then there exist $T_0 > 0$ and U(t) as in
(3.11.34), with power series convergent for $|t| < T_0$, such that the general solution
to (3.11.1) for $t \in (0, T_0)$ has the form (3.11.31).
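
To see the recursion (3.11.36) at work in a half-integer case, the sketch below
solves its first step, $(2I - C_{A_0}) U_2 = A_2$, for the Bessel system with $\nu = 1/2$, in exact
arithmetic. The approach (writing the operator $U \mapsto kU - [A_0, U]$ as a 4 x 4 matrix
and applying Gauss-Jordan elimination) and all function names are choices made
for this illustration, not a method prescribed in the text.

```python
from fractions import Fraction as Fr

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op(k, A0, U):
    """Apply U -> k U - [A_0, U], i.e. (kI - C_{A_0}) U."""
    AU, UA = mul(A0, U), mul(U, A0)
    return [[k * U[i][j] - AU[i][j] + UA[i][j] for j in range(2)]
            for i in range(2)]

def solve_commutator_equation(k, A0, R):
    """Solve (kI - C_{A_0}) U = R, assuming k is not in Spec C_{A_0}."""
    columns = []
    for p in range(4):
        E = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
        E[p // 2][p % 2] = Fr(1)
        img = op(k, A0, E)
        columns.append([img[0][0], img[0][1], img[1][0], img[1][1]])
    # Augmented 4x5 matrix; column p of the operator is columns[p].
    M = [[columns[p][r] for p in range(4)] + [R[r // 2][r % 2]]
         for r in range(4)]
    for c in range(4):
        piv = next(r for r in range(c, 4) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(4):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [[M[0][4], M[1][4]], [M[2][4], M[3][4]]]

A0 = [[Fr(0), Fr(1)], [Fr(1, 4), Fr(0)]]   # (3.11.32) with nu = 1/2
A2 = [[Fr(0), Fr(0)], [Fr(-1), Fr(0)]]
U2 = solve_commutator_equation(2, A0, A2)
```

Here $\operatorname{Spec} C_{A_0} = \{1, 0, -1\}$ contains the odd integer 1, so Proposition 3.11.2 does not
apply, but no even integer 2k is excluded and each step of (3.11.36) is solvable.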


We return to the Bessel equation (3.11.6) and consider the case $\nu = 0$. That
is, we consider (3.11.1) with A(t) as in (3.11.7), and

(3.11.37)  $A_0 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$

Proposition 3.11.3 applies to this case (so does Proposition 3.11.2), and the general
solution to (3.11.6) is the first entry in (3.11.31), where U(t) has the form (3.11.34).
Note that in case (3.11.37), $A_0^2 = 0$, so for t > 0,

(3.11.38)  $t^{A_0} = \begin{pmatrix} 1 & \log t \\ 0 & 1 \end{pmatrix}.$

Thus there are two linearly independent solutions to

(3.11.39)  $\frac{d^2 u}{dt^2} + \frac{1}{t} \frac{du}{dt} + u = 0,$

for t > 0, one having the form

(3.11.40)  $\sum_{k \ge 0} a_k t^{2k},$

with coefficients $a_k$ given recursively, and another having the form

(3.11.41)  $\sum_{k \ge 0} (b_k + c_k \log t) t^{2k},$

again with coefficients $b_k$ and $c_k$ given recursively. The solution of the form (3.11.40)
is as in (1.16.16) (with $\nu = 0$), while the solution of the form (3.11.41) can be shown
to be consistent with $Y_0(t)$ in (1.16.34)–(1.16.36).


Relaxed spectral condition on $C_{A_0}$


Proceeding beyond the purview of Propositions 3.11.2 and 3.11.3, we now treat
the case when $A_0$ satisfies the following conditions. First,

(3.11.42)  $\operatorname{Spec} C_{A_0}$ contains exactly one positive integer, $\ell$,

and second,

(3.11.43)  $A_0$ is diagonalizable,

which implies

(3.11.44)  $C_{A_0}$ is diagonalizable;

cf. Chapter 2, §2.7, Exercise 8. Later we will discuss weakening these conditions.


As in Proposition 3.11.2, we use a transformation of the form (3.11.20), i.e.,
x(t) = U(t)y(t), with U(t) as in (3.11.19), but this time our goal is to obtain for y,
not the equation (3.11.21), but rather one of the form

(3.11.45)  $t \frac{dy}{dt} = (A_0 + B_\ell t^\ell) y,$

with the additional property of a special structure on the commutator $[A_0, B_\ell]$,
given in (3.11.55) below. To get (3.11.45), we use (3.11.22) to obtain for U(t) the
equation

(3.11.46)  $t \frac{dU}{dt} = A(t)U(t) - U(t)(A_0 + B_\ell t^\ell),$

in place of (3.11.23). Taking A(t) as in (3.11.2) and U(t) as in (3.11.19), we have

(3.11.47)  $t \frac{dU}{dt} = \sum_{k \ge 1} k U_k t^k,$

(3.11.48)  $A(t)U(t) - U(t)A_0 = \sum_{k \ge 1} [A_0, U_k] t^k + \sum_{k \ge 1} \Bigl( \sum_{j=0}^{k-1} A_{k-j} U_j \Bigr) t^k,$


and hence solving (3.11.46) requires for $U_k$, $k \ge 1$, that

(3.11.49)  $k U_k = [A_0, U_k] + \sum_{j=0}^{k-1} A_{k-j} U_j - \Gamma_k,$

where

(3.11.50)  $\Gamma_k = \begin{cases} 0, & k < \ell, \\ B_\ell, & k = \ell, \\ U_{k-\ell} B_\ell, & k > \ell. \end{cases}$

Equivalently,

(3.11.51)  $(kI - C_{A_0}) U_k = \sum_{j=0}^{k-1} A_{k-j} U_j - \Gamma_k.$

As before, (3.11.51) has a unique solution for each $k < \ell$, since $C_{A_0} - kI$ is invertible
on $M(n, \mathbb{C})$. For $k = \ell$, the equation is

(3.11.52)  $(\ell I - C_{A_0}) U_\ell = \sum_{j=0}^{\ell-1} A_{\ell-j} U_j - B_\ell.$

This time $C_{A_0} - \ell I$ is not invertible. However, if (3.11.44) holds,

(3.11.53)  $M(n, \mathbb{C}) = \mathcal{N}(C_{A_0} - \ell I) \oplus \mathcal{R}(C_{A_0} - \ell I).$

Consequently, given $\sum_{j=0}^{\ell-1} A_{\ell-j} U_j \in M(n, \mathbb{C})$, we can take

(3.11.54)  $B_\ell \in \mathcal{N}(C_{A_0} - \ell I)$

so that the right side of (3.11.52) belongs to $\mathcal{R}(C_{A_0} - \ell I)$, and then we can find a
solution $U_\ell$. We can uniquely specify $U_\ell$ by requiring $U_\ell \in \mathcal{R}(C_{A_0} - \ell I)$, though
that is of no great consequence. Having such $B_\ell$ and $U_\ell$, we can proceed to solve
(3.11.51) for each $k > \ell$. Estimates on the coefficients $U_k$ guaranteeing a positive
radius of convergence for the power series (3.11.19) again follow by techniques
of §3.10. We have reduced the problem of representing the general solution to
(3.11.1) for $t \in (0, T_0)$ to that of representing the general solution to (3.11.45),
given that (3.11.54) holds. The following result accomplishes this latter task. Note
that (3.11.54) is equivalent to

(3.11.55)  $[A_0, B_\ell] = \ell B_\ell, \quad \text{i.e.,} \quad A_0 B_\ell = B_\ell (A_0 + \ell I).$


Lemma 3.11.4. Given $A_0, B_\ell \in M(n, \mathbb{C})$ satisfying (3.11.55), the general solution
to (3.11.45) on t > 0 is given by

(3.11.56)  $y(t) = t^{A_0} t^{B_\ell} v, \quad v \in \mathbb{C}^n.$

Proof. As mentioned earlier in this section, results of §3.10 imply that for each
$v \in \mathbb{C}^n$, there is a unique solution to (3.11.45) on t > 0 satisfying y(1) = v. It
remains to show that the right side of (3.11.56) satisfies (3.11.45). Indeed, if y(t)
is given by (3.11.56), then, for t > 0,

(3.11.57)  $t \frac{dy}{dt} = A_0 t^{A_0} t^{B_\ell} v + t^{A_0} B_\ell t^{B_\ell} v.$

Now (3.11.55) implies, for each $m \in \mathbb{N}$,

(3.11.58)  $A_0^m B_\ell = A_0^{m-1} B_\ell (A_0 + \ell I) = \cdots = B_\ell (A_0 + \ell I)^m,$

which in turn implies

(3.11.59)  $e^{sA_0} B_\ell = B_\ell e^{s(A_0 + \ell I)} = B_\ell e^{s\ell} e^{sA_0},$

hence

(3.11.60)  $t^{A_0} B_\ell = B_\ell t^\ell t^{A_0}.$

Therefore (3.11.57) gives

(3.11.61)  $t \frac{dy}{dt} = (A_0 + B_\ell t^\ell) t^{A_0} t^{B_\ell} v,$

as desired.


The construction involving (3.11.45)–(3.11.55) plus Lemma 3.11.4 yields the
following.

Proposition 3.11.5. Assume $A_0 \in M(n, \mathbb{C})$ has the property (3.11.42) and is
diagonalizable. Then there exist $T_0 > 0$, U(t) as in (3.11.19), and $B_\ell \in M(n, \mathbb{C})$,
satisfying (3.11.55), such that the general solution to (3.11.1) on $t \in (0, T_0)$ is

(3.11.62)  $x(t) = U(t) t^{A_0} t^{B_\ell} v, \quad v \in \mathbb{C}^n.$

The following is an important property of $B_\ell$.

Proposition 3.11.6. In the setting of Proposition 3.11.5, $B_\ell$ is nilpotent.

Proof. This follows readily from (3.11.55), which implies that for each $\lambda_j \in \operatorname{Spec} A_0$,

(3.11.63)  $B_\ell : \mathcal{GE}(A_0, \lambda_j) \longrightarrow \mathcal{GE}(A_0, \lambda_j + \ell).$

Remark. Note that if $B_\ell^{m+1} = 0$, then, for t > 0,

(3.11.64)  $t^{B_\ell} = \sum_{k=0}^{m} \frac{1}{k!} (\log t)^k B_\ell^k.$
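
The finite sum (3.11.64) translates directly into code. The sketch below is a
minimal illustration; the function names are assumptions of this example, and the
routine takes for granted that the matrix passed in is nilpotent (it does not check
this).

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def t_power_nilpotent(B, t):
    """Evaluate t^B = sum_k (log t)^k B^k / k! for nilpotent B and t > 0.

    For an n x n nilpotent matrix B^n = 0, so at most n terms contribute.
    """
    n = len(B)
    s = math.log(t)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (log t)^k B^k / k!
    for k in range(1, n + 1):
        term = [[s * x / k for x in row] for row in mat_mul(term, B)]
        if all(x == 0 for row in term for x in row):
            break
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# The matrix A_0 of (3.11.37) is itself nilpotent.
B = [[0.0, 1.0], [0.0, 0.0]]
P = t_power_nilpotent(B, 3.0)
```

With this choice of B the result reproduces the matrix in (3.11.38), with log 3 in the
upper right corner.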


Further Bessel equation connection 


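Let us apply these results to the Bessel equation (3.11.6) in case $\nu = n$ is a
positive integer. We are hence looking at (3.11.1) when

(3.11.65)  $A(t) = A_0 + A_2 t^2, \quad A_0 = \begin{pmatrix} 0 & 1 \\ n^2 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}.$

We have

(3.11.66)  $\operatorname{Spec} A_0 = \{n, -n\}, \quad \operatorname{Spec} C_{A_0} = \{2n, 0, -2n\}.$

Clearly, $A_0$ is diagonalizable. The recursion (3.11.51) for $U_k$ takes the form

(3.11.67)  $(kI - C_{A_0}) U_k = \Sigma_k + \Gamma_k,$

where

(3.11.68)  $\Sigma_k = \begin{cases} 0, & k < 2, \\ A_2, & k = 2, \\ A_2 U_{k-2}, & k > 2, \end{cases}$

and

(3.11.69)  $\Gamma_k = \begin{cases} 0, & k < 2n, \\ B_{2n}, & k = 2n, \\ U_{k-2n} B_{2n}, & k > 2n. \end{cases}$

In particular, the critical equation (3.11.52) is

(3.11.70)  $(2nI - C_{A_0}) U_{2n} = A_2 U_{2n-2} + B_{2n},$

and we solve this after picking

(3.11.71)  $B_{2n} \in \mathcal{N}(C_{A_0} - 2nI),$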


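such that the right side of (3.11.70) belongs to $\mathcal{R}(C_{A_0} - 2nI)$. We have from §2.7,
Exercise 9, that (since $A_0$ is diagonalizable)

(3.11.72)  $\mathcal{N}(C_{A_0} - 2nI) = \operatorname{Span}\{v w^t : v \in \mathcal{E}(A_0, n),\ w \in \mathcal{E}(A_0^t, -n)\}.$

For $A_0$ in (3.11.65), v is a multiple of $(1, n)^t$ and $w^t$ is a multiple of $(-n, 1)$, so

(3.11.73)  $B_{2n} = \beta_n B_n^\#, \quad \beta_n \in \mathbb{C}, \quad B_n^\# = \begin{pmatrix} -n & 1 \\ -n^2 & n \end{pmatrix}.$

Note that $B_{2n}^2 = 0$. Consequently, the general solution to (3.11.1) in this case takes
the form

(3.11.74)  $x(t) = U(t) t^{A_0} t^{B_{2n}} v,$

with

(3.11.75)  $t^{B_{2n}} = I + (\log t) B_{2n}.$

Note that

(3.11.76)  $\mathcal{N}(B_{2n}) = \operatorname{Span} \begin{pmatrix} 1 \\ n \end{pmatrix} = \mathcal{E}(A_0, n),$

so

(3.11.77)  $U(t) t^{A_0} t^{B_{2n}} \begin{pmatrix} 1 \\ n \end{pmatrix} = U(t) t^{A_0} \begin{pmatrix} 1 \\ n \end{pmatrix} = U(t) t^n \begin{pmatrix} 1 \\ n \end{pmatrix}$

is a regular solution to (3.11.1). Its first component is, up to a constant multiple, the
solution $J_n(t)$ (in case $\nu = n$) given in (1.16.16). The recursion gives results similar
to (1.16.7)–(1.16.9) of Chapter 1, and U(t) has infinite radius of convergence. Note
also that

(3.11.78)  $A_0 \begin{pmatrix} 1 \\ -n \end{pmatrix} = -n \begin{pmatrix} 1 \\ -n \end{pmatrix}, \quad B_{2n} \begin{pmatrix} 1 \\ -n \end{pmatrix} = -2n\beta_n \begin{pmatrix} 1 \\ n \end{pmatrix},$

which in concert with (3.11.75)–(3.11.76) gives

(3.11.79)  $U(t) t^{A_0} t^{B_{2n}} \begin{pmatrix} 1 \\ -n \end{pmatrix} = U(t) t^{A_0} \begin{pmatrix} 1 \\ -n \end{pmatrix} - 2n\beta_n (\log t)\, U(t) t^{A_0} \begin{pmatrix} 1 \\ n \end{pmatrix}$
           $= U(t) t^{-n} \begin{pmatrix} 1 \\ -n \end{pmatrix} - 2n\beta_n (\log t)\, U(t) t^n \begin{pmatrix} 1 \\ n \end{pmatrix}.$

The first component gives a solution to (3.11.6), with $\nu = n$, complementary to
$J_n(t)$, for t > 0. Compare the formula for $Y_n(t)$ in (1.16.36).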
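
The algebraic facts used in this Bessel computation can be confirmed in integer
arithmetic. The sketch below builds $A_0$ from (3.11.65) and the rank-one matrix $v w^t$
with v = (1, n)^t and w = (-n, 1)^t, that is, $B_{2n}$ with the scalar $\beta_n$ normalized to 1.
That normalization, the sample value n = 3, and the function names are
assumptions of this example.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bessel_matrices(n):
    """Return A_0 from (3.11.65) and the rank-one matrix v w^t of (3.11.72)."""
    A0 = [[0, 1], [n * n, 0]]
    v = [1, n]    # eigenvector of A_0 for the eigenvalue n
    w = [-n, 1]   # eigenvector of the transpose of A_0 for the eigenvalue -n
    B = [[v[i] * w[j] for j in range(2)] for i in range(2)]
    return A0, B

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

n = 3
A0, B = bessel_matrices(n)
bracket = commutator(A0, B)   # expected to equal 2n B, as in (3.11.55)
square = mat_mul(B, B)        # expected to vanish
# Action of B on (1, -n)^t, compare (3.11.78) with beta_n = 1.
image = [B[0][0] - n * B[0][1], B[1][0] - n * B[1][1]]
```

The three computed quantities correspond to the commutator identity (3.11.55) with
$\ell = 2n$, the nilpotence $B_{2n}^2 = 0$, and the second identity in (3.11.78).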


Summary of analysis of 2 x 2 systems. When (3.11.1) is a 2 x 2 system, either
Proposition 3.11.2 or Proposition 3.11.5 will be applicable. Indeed, if $A_0 \in M(2, \mathbb{C})$
and its two eigenvalues differ by a nonzero integer $\ell$, then $A_0$ is diagonalizable, and

$\operatorname{Spec} C_{A_0} = \{\ell, 0, -\ell\},$

so (3.11.42) holds.


Further relaxed spectral conditions on $C_{A_0}$, germane for $n \ge 3$


To extend the scope of Proposition 3.11.5, let us first note that the hypothesis
(3.11.43) that $A_0$ is diagonalizable was used only to pass to (3.11.53), so we can
replace this hypothesis by

(3.11.80)  $\ell \in \mathbb{N} \cap \operatorname{Spec} C_{A_0} \Longrightarrow M(n, \mathbb{C}) = \mathcal{N}(C_{A_0} - \ell I) \oplus \mathcal{R}(C_{A_0} - \ell I).$

We now show that we can drop hypothesis (3.11.42). In general, if $\operatorname{Spec} C_{A_0} \cap \mathbb{N} \ne \emptyset$,
we have a finite set

(3.11.81)  $\operatorname{Spec} C_{A_0} \cap \mathbb{N} = \{\ell_j : 1 \le j \le m\};$

say $\ell_1 < \cdots < \ell_m$. In this more general setting, we use a transformation of the
form (3.11.20), i.e., x(t) = U(t)y(t), with U(t) as in (3.11.19), to obtain for y an
equation of the form

(3.11.82)  $t \frac{dy}{dt} = (A_0 + B_{\ell_1} t^{\ell_1} + \cdots + B_{\ell_m} t^{\ell_m}) y,$

with commutator properties on $[A_0, B_{\ell_j}]$ analogous to (3.11.55) (see (3.11.85) below).
In this setting, in place of (3.11.46), we aim for

(3.11.83)  $t \frac{dU}{dt} = A(t)U(t) - U(t)(A_0 + B_{\ell_1} t^{\ell_1} + \cdots + B_{\ell_m} t^{\ell_m}).$

We continue to have (3.11.47)–(3.11.51), with a natural replacement for $\Gamma_k$ (cf. Exercise 11
below). The equation (3.11.51) for $k \notin \{\ell_1, \dots, \ell_m\}$ is uniquely solvable
for $U_k$ because $kI - C_{A_0}$ is invertible. For $k = \ell_j$, $1 \le j \le m$, one can pick

(3.11.84)  $B_{\ell_j} \in \mathcal{N}(C_{A_0} - \ell_j I),$

such that the appropriate variant of (3.11.52) is solvable for $U_{\ell_j}$, using (3.11.80).
Note that (3.11.84) is equivalent to

(3.11.85)  $[A_0, B_{\ell_j}] = \ell_j B_{\ell_j}, \quad \text{i.e.,} \quad A_0 B_{\ell_j} = B_{\ell_j} (A_0 + \ell_j I).$


Proposition 3.11.7. Let $n \ge 3$. Assume $A_0 \in M(n, \mathbb{C})$ has the property (3.11.80).
For each $\ell_j$ as in (3.11.81), take $B_{\ell_j}$ as indicated above, and set

(3.11.86)  $B = B_{\ell_1} + \cdots + B_{\ell_m}.$

Then there exist $T_0 > 0$ and U(t) as in (3.11.19) such that the general solution to
(3.11.1) on $t \in (0, T_0)$ is

(3.11.87)  $x(t) = U(t) t^{A_0} t^B v, \quad v \in \mathbb{C}^n.$

Proof. It suffices to show that the general solution to (3.11.82) is

(3.11.88)  $y(t) = t^{A_0} t^B v, \quad v \in \mathbb{C}^n,$

given that the matrices $B_{\ell_j}$ satisfy (3.11.85). In turn, it suffices to show that if y(t)
is given by (3.11.88), then (3.11.82) holds. To verify this, write

(3.11.89)  $t \frac{dy}{dt} = A_0 t^{A_0} t^B v + t^{A_0} (B_{\ell_1} + \cdots + B_{\ell_m}) t^B v.$

Now (3.11.85) yields

(3.11.90)  $A_0^k B_{\ell_j} = B_{\ell_j} (A_0 + \ell_j I)^k, \quad \text{hence} \quad t^{A_0} B_{\ell_j} = B_{\ell_j} t^{\ell_j} t^{A_0},$

which together with (3.11.89) yields (3.11.82), as needed.

Parallel to Proposition 3.11.6, we have the following.

Proposition 3.11.8. In the setting of Proposition 3.11.7, B is nilpotent.

Proof. By (3.11.85), we have, for each $\lambda_j \in \operatorname{Spec} A_0$,

(3.11.91)  $B : \mathcal{GE}(A_0, \lambda_j) \longrightarrow \mathcal{GE}(A_0, \lambda_j + \ell_1) \oplus \cdots \oplus \mathcal{GE}(A_0, \lambda_j + \ell_m),$

which readily implies nilpotence.

Note. If $B^{m+1} = 0$, then again (3.11.64) holds, with B in place of $B_\ell$.
Exercises


1. In place of (3.11.3), consider second order equations of the form

(3.11.92)  $t u''(t) + b(t) u'(t) + c(t) u(t) = 0,$

where b(t) and c(t) have convergent power series in t for $|t| < T_0$. In such a case,
show that setting

(3.11.93)  $x(t) = \begin{pmatrix} u(t) \\ u'(t) \end{pmatrix}$

yields a system of the form (3.11.1) with

(3.11.94)  $A(t) = \begin{pmatrix} 0 & t \\ -c(t) & -b(t) \end{pmatrix}.$

Contrast this with what you would get by multiplying (3.11.92) by t and using the
formula (3.11.4) for x(t).


2. Make note of how to extend the study of (3.11.1) to

(3.11.95)  $(t - t_0) \frac{dx}{dt} = A(t) x,$

when $A(t) = \sum_{k \ge 0} A_k (t - t_0)^k$ for $|t - t_0| < T_0$. We say $t_0$ is a regular singular
point for (3.11.95).


3. The following is known as the hypergeometric equation:

(3.11.96)  $t(1 - t) u''(t) + [\gamma - (\alpha + \beta + 1)t] u'(t) - \alpha\beta u(t) = 0.$

Show that $t_0 = 0$ and $t_0 = 1$ are regular singular points and construct solutions
near these points, given $\alpha$, $\beta$, and $\gamma$.


4. The following is known as the confluent hypergeometric equation:

(3.11.97)  $t u''(t) + (\gamma - t) u'(t) - \alpha u(t) = 0.$

Show that $t_0 = 0$ is a regular singular point and construct solutions near this point,
given $\alpha$ and $\gamma$.


5. Let B(t) be analytic in t for |t| > a. We say that the equation

(3.11.98)  $\frac{dy}{dt} = B(t) y$

has a regular singular point at infinity provided that the change of variable

(3.11.99)  $x(t) = y\Bigl(\frac{1}{t}\Bigr)$

transforms (3.11.98) to an equation with a regular singular point at t = 0. Specify
for which B(t) this happens.


6. Show that the hypergeometric equation (3.11.96) has a regular singular point at
infinity.


7. What can you say about the behavior as $t \searrow 0$ of solutions to (3.11.1) when


21 00 0 
A(t) = 4Ao+ Ait, Ap = 1 ee 1)? 
0 0 


8. What can you say about the behavior as $t \searrow 0$ of solutions to (3.11.1) when


11 0 0 0 
A(t) = Ao + Ait, Ao = 1 ». we 


1}? 
0 0 


9. In the context of Lemma 3.11.4, i.e., with $A_0$ and $B_\ell$ satisfying (3.11.55), show
that

$B_\ell$ and $e^{2\pi i A_0}$ commute.

More generally, in the context of Proposition 3.11.7, with $A_0$ and B satisfying the
equations (3.11.85)–(3.11.86), show that

B and $e^{2\pi i A_0}$ commute.

Deduce that for all t > 0

(3.11.100)  $t^{A_0} e^{2\pi i A_0} t^B e^{2\pi i B} = t^{A_0} t^B C, \quad C = e^{2\pi i A_0} e^{2\pi i B}.$


10. In the setting of Exercise 9, pick $E \in M(n, \mathbb{C})$ such that

(3.11.101)  $e^{2\pi i E} = C$

(cf. Appendix 3.A). Set

(3.11.102)  $Q(t) = t^{A_0} t^B t^{-E}, \quad t > 0.$

Show that there exists $m \in \mathbb{Z}^+$ such that $t^m Q(t)$ is a polynomial in t, with
coefficients in $M(n, \mathbb{C})$. Deduce that, in the setting of Proposition 3.11.7, the general
solution to (3.11.1) on $t \in (0, T_0)$ is

(3.11.103)  $x(t) = U(t) Q(t) t^E v, \quad v \in \mathbb{C}^n,$

with U(t) as in (3.11.19), E as in (3.11.101), and Q(t) as in (3.11.102), so that
$t^m Q(t)$ is a polynomial in t.


11. In the setting of (3.11.80)–(3.11.83), write down the replacement for $\Gamma_k$ advertised
there. Specify the variant of (3.11.52) that one needs to solve to construct
U(t) satisfying (3.11.83), as needed for Proposition 3.11.7.
3.A. Logarithms of matrices


Given $C \in M(n, \mathbb{C})$, we say $X \in M(n, \mathbb{C})$ is a logarithm of C provided

(3.A.1)  $e^X = C.$

In this appendix, we aim to prove the following:

Proposition 3.A.1. If $C \in M(n, \mathbb{C})$ is invertible, there exists $X \in M(n, \mathbb{C})$ satisfying
(3.A.1).

Let us start with the case n = 1, i.e., $C \in \mathbb{C}$. In case C is a positive real
number, we can take $X = \log C$, defined as in §1.1; cf. (1.1.21)–(1.1.27). More
generally, for $C \in \mathbb{C} \setminus 0$, we can write

(3.A.2)  $C = |C| e^{i\theta}, \quad X = \log |C| + i\theta.$

Note that the logarithm X of C is not uniquely defined. If $X \in \mathbb{C}$ solves (3.A.1),
so does $X + 2\pi i k$ for each $k \in \mathbb{Z}$. As is customary, for $C \in \mathbb{C} \setminus 0$, we will denote
any such solution by $\log C$.
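
The non-uniqueness in the scalar case is easy to observe numerically. The sketch
below uses Python's standard cmath module; the function name and the sample
value of C are assumptions of this example.

```python
import cmath

def logarithms(c, ks=(-1, 0, 1)):
    """Return several solutions X of e^X = c for a nonzero complex number c."""
    principal = cmath.log(c)   # log|c| + i*theta, with theta in (-pi, pi]
    return [principal + 2j * cmath.pi * k for k in ks]

C = -2 + 1j
values = logarithms(C)
residuals = [abs(cmath.exp(x) - C) for x in values]
```

Each entry of values differs from the next by $2\pi i$, and each exponentiates back to C
up to rounding error.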

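Let us now take an invertible $C \in M(n, \mathbb{C})$ with n > 1, and look for a logarithm,
i.e., a solution to (3.A.1). Such a logarithm is easy to produce if C is diagonalizable,
i.e., if for some invertible $B \in M(n, \mathbb{C})$,

(3.A.3)  $B^{-1} C B = D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}.$

Then

(3.A.4)  $Y = \begin{pmatrix} \mu_1 & & \\ & \ddots & \\ & & \mu_n \end{pmatrix}, \quad \mu_k = \log \lambda_k \Longrightarrow e^Y = D,$

and so

(3.A.5)  $e^{BYB^{-1}} = B D B^{-1} = C.$

Similar arguments, in concert with results of §§2.7-2.8, show that to prove
Proposition 3.A.1 it suffices to construct a logarithm of

(3.A.6)  $C = \lambda(I + N), \quad \lambda \in \mathbb{C} \setminus 0, \quad N^n = 0.$

In turn, if we can solve for Y the equation

(3.A.7)  $e^Y = I + N,$

given N nilpotent, then

(3.A.8)  $\mu = \log \lambda \Longrightarrow e^{\mu I + Y} = \lambda(I + N),$

so it suffices to solve (3.A.7) for $Y \in M(n, \mathbb{C})$.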


We will produce a solution Y in the form of a power series in N. To prepare
for this, we first strike off on a slight tangent and produce a series solution to

(3.A.9)  $e^{X(t)} = I + tA, \quad A \in M(n, \mathbb{C}), \quad \|tA\| < 1.$

Taking a cue from the power series for $\log(1 + t)$ given in (1.1.56), we establish the
following.

Proposition 3.A.2. In case $\|tA\| < 1$, (3.A.9) is solved by

(3.A.10)  $X(t) = \sum_{k=1}^\infty \frac{(-1)^{k-1}}{k} t^k A^k = tA - \frac{t^2}{2} A^2 + \frac{t^3}{3} A^3 - \cdots.$

Proof. If X(t) is given by (3.A.10), we have

(3.A.11)  $\frac{dX}{dt} = A - tA^2 + t^2 A^3 - \cdots = A(I - tA + t^2 A^2 - \cdots) = A(I + tA)^{-1}.$

Hence

(3.A.12)  $\frac{d}{dt} e^{-X(t)} = -e^{-X(t)} A(I + tA)^{-1},$

for $|t| < 1/\|A\|$; cf. §3.1, Exercise 12. It follows that

(3.A.13)  $\frac{d}{dt} \bigl( e^{-X(t)} (I + tA) \bigr) = e^{-X(t)} \bigl( -A(I + tA)^{-1}(I + tA) + A \bigr) = 0,$

so

(3.A.14)  $e^{-X(t)} (I + tA) \equiv e^{-X(0)} = I,$

which implies (3.A.9).


The task of solving (3.A.7) and hence completing the proof of Proposition 3.A.1
is accomplished by the following result.

Proposition 3.A.3. If $N \in M(n, \mathbb{C})$ is nilpotent, then for all $t \in \mathbb{R}$,

(3.A.15)  $e^{Y(t)} = I + tN$

is solved by

(3.A.16)  $Y(t) = \sum_{k=1}^{n-1} \frac{(-1)^{k-1}}{k} t^k N^k.$

Proof. If Y(t) is given by (3.A.16), we see that Y(t) is nilpotent and that $e^{Y(t)}$ is a
polynomial in t. Thus both sides of (3.A.15) are polynomials in t, and Proposition
3.A.2 implies they are equal for $|t| < 1/\|N\|$, so (3.A.15) holds for all $t \in \mathbb{R}$.
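
Because both (3.A.16) and the exponential of a nilpotent matrix are finite sums,
Proposition 3.A.3 can be checked in exact rational arithmetic. The following sketch
does so for one 3 x 3 strictly upper triangular matrix; the function names and the
sample matrix are assumptions of this example.

```python
from fractions import Fraction as Fr

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]

def log_unipotent(N, t=Fr(1)):
    """Y(t) = sum_{k=1}^{n-1} (-1)^(k-1) t^k N^k / k, as in (3.A.16)."""
    n = len(N)
    Y = [[Fr(0)] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, n):
        power = mat_mul(power, N)
        c = Fr((-1) ** (k - 1)) * t ** k / k
        Y = [[Y[i][j] + c * power[i][j] for j in range(n)] for i in range(n)]
    return Y

def exp_nilpotent(Y):
    """e^Y as the finite sum of Y^k / k!, valid when Y is nilpotent."""
    n = len(Y)
    result = identity(n)
    term = identity(n)
    for k in range(1, n):
        term = [[x / k for x in row] for row in mat_mul(term, Y)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

N = [[Fr(0), Fr(1), Fr(2)], [Fr(0), Fr(0), Fr(3)], [Fr(0), Fr(0), Fr(0)]]
Y = log_unipotent(N)
E = exp_nilpotent(Y)                          # should equal I + N
E2 = exp_nilpotent(log_unipotent(N, Fr(2)))   # should equal I + 2N
```

Here $Y = N - N^2/2$, and exponentiating recovers I + N exactly, with no restriction on
the size of t.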


3.B. The matrix Laplace transform 

In §1.18 of Chapter 1 we defined the Laplace transform 

(3.B.1)  $\mathcal{L}f(s) = \int_0^\infty f(t) e^{-st}\, dt, \quad \operatorname{Re} s > \alpha,$

for a function $f : [0, \infty) \to \mathbb{C}$, integrable on [0, R] for each $R < \infty$, and satisfying

(3.B.2)  $\int_0^\infty |f(t)| e^{-\beta t}\, dt < \infty, \quad \forall\, \beta > \alpha.$

Such a transform is also well defined for

(3.B.3)  $f : [0, \infty) \to V,$

when V is a finite dimensional normed vector space, such as $\mathbb{C}^n$, or $M(n, \mathbb{C})$, and
basic results developed in Chapter 1 continue to apply.


Such a Laplace transform provides a tool to treat n x n first-order systems of
differential equations, of the form

(3.B.4)  $f'(t) = A f(t) + g(t), \quad f(0) = v,$

given

(3.B.5)  $A \in M(n, \mathbb{C}), \quad v \in \mathbb{C}^n, \quad g : [0, \infty) \to \mathbb{C}^n,$

with g piecewise continuous and satisfying

(3.B.6)  $\|g(t)\| \le C e^{\alpha t}, \quad \text{for } t \ge 0.$

We seek a solution $f : [0, \infty) \to \mathbb{C}^n$ that is piecewise continuous and satisfies a
similar bound. If (3.B.4) holds, f'(t) also has this property, and integration by
parts in (3.B.1) yields

(3.B.7)  $\mathcal{L}f'(s) = s \mathcal{L}f(s) - f(0),$

so applying $\mathcal{L}$ to (3.B.4) yields

(3.B.8)  $s \mathcal{L}f(s) - v = A \mathcal{L}f(s) + \mathcal{L}g(s),$

or

(3.B.9)  $(sI - A) \mathcal{L}f(s) = v + \mathcal{L}g(s).$

Hence, for $\operatorname{Re} s$ sufficiently large,

(3.B.10)  $\mathcal{L}f(s) = (sI - A)^{-1} (v + \mathcal{L}g(s)).$

In this way, solving (3.B.4) is translated to solving the recognition problem, enunciated
in §1.18, i.e., finding the function f that satisfies (3.B.10).
The approach to this introduced in §1.18 was to build up a collection of known
Laplace transforms. Here, building on (1.18.20), we start with the matrix exponential
function,

(3.B.11)  $E_A(t) = e^{tA}, \quad A \in M(n, \mathbb{C}).$

We claim that

(3.B.12)  $\mathcal{L}E_A(s) = (sI - A)^{-1}, \quad \text{for } \operatorname{Re} s > \alpha,$

provided

(3.B.13)  $\|e^{tA}\| \le c e^{\alpha t}, \quad t \ge 0.$

To see this, let $L \in M(n, \mathbb{C})$ and note that the identity $(d/dt) e^{-tL} = -L e^{-tL}$
implies

(3.B.14)  $L \int_0^T e^{-tL}\, dt = I - e^{-TL},$

for each $T < \infty$. If L satisfies

(3.B.15)  $\|e^{-tL}\| \le C e^{-\delta t}, \quad \forall\, t > 0,$

for some $\delta > 0$, then we can take $T \to \infty$ in (3.B.14), and deduce that

(3.B.16)  $L \int_0^\infty e^{-tL}\, dt = I, \quad \text{i.e.,} \quad \int_0^\infty e^{-tL}\, dt = L^{-1}.$

This applies to L = sI - A as long as (3.B.13) holds and $\operatorname{Re} s > \alpha$, since

(3.B.17)  $\|e^{-t(sI - A)}\| = e^{-t \operatorname{Re} s} \|e^{tA}\|,$

so we have (3.B.12).
The result (3.B.12) gives

(3.B.18)  $(sI - A)^{-1} v = \mathcal{L}(E_A v)(s),$

which treats part of the right side of (3.B.10). It remains to identify the inverse
Laplace transform of

(3.B.19)  $(sI - A)^{-1} \mathcal{L}g(s) = \mathcal{L}E_A(s)\, \mathcal{L}g(s).$

One approach to this applies the following result, established in Proposition 1.18.2,
in the scalar case. The extension to the matrix case is straightforward.

Proposition 3.B.1. Let $g : [0, \infty) \to \mathbb{C}^n$ and $B : [0, \infty) \to M(n, \mathbb{C})$ be piecewise
continuous and satisfy exponential bounds of the form (3.B.6). Take the convolution,

(3.B.20)  $B * g(t) = \int_0^t B(t - \tau) g(\tau)\, d\tau.$

Then, for $\operatorname{Re} s > \alpha$,

(3.B.21)  $\mathcal{L}(B * g)(s) = \mathcal{L}B(s)\, \mathcal{L}g(s).$

Applying Proposition 3.B.1 to (3.B.19), we have

(3.B.22)  $(sI - A)^{-1} \mathcal{L}g(s) = \mathcal{L}(E_A * g)(s),$

with

(3.B.23)  $E_A * g(t) = \int_0^t e^{(t - \tau)A} g(\tau)\, d\tau.$

Combining this with (3.B.18) and (3.B.10), we derive for the solution to (3.B.4)
the formula

(3.B.24)  $f(t) = e^{tA} v + \int_0^t e^{(t - \tau)A} g(\tau)\, d\tau.$

This is of course the Duhamel formula (3.4.5), derived in §3.4 by different (and 
arguably more elementary) means. One advantage of the derivation in §3.4 is that 
it does not require a global bound on the function g, of the form (3.B.6). Indeed, 
g can blow up in finite time, T, and (3.B.24) will still work for t € [0,T). Another 
advantage is that the method of §3.4 generalizes to variable coefficient systems, as 
seen in §3.9. 
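
For a concrete check of the Duhamel formula (3.B.24), one can compare it with a
known closed-form solution in the scalar case n = 1. The sketch below approximates
the convolution integral by the composite trapezoidal rule; the quadrature method,
the function name, and the sample data are assumptions of this example.

```python
import math

def duhamel(a, v, g, t, steps=2000):
    """Evaluate e^{ta} v + int_0^t e^{(t - tau) a} g(tau) d tau for scalar a.

    The integral is approximated by the composite trapezoidal rule.
    """
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        tau = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp((t - tau) * a) * g(tau)
    return math.exp(t * a) * v + h * total

# f' = a f + 1, f(0) = v has the closed form e^{ta} v + (e^{ta} - 1) / a.
a, v, t = -0.5, 2.0, 3.0
approx = duhamel(a, v, lambda tau: 1.0, t)
exact = math.exp(t * a) * v + (math.exp(t * a) - 1.0) / a
```

The two values agree up to the discretization error of the quadrature.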

On the other hand, having a collection of Laplace transforms and inverse 
Laplace transforms can be useful for computing the convolution product. Hence 
the connection between the two given by Proposition 3.B.1 is a double-edged tool. 
3.C. Complex analytic functions


As stated in (3.10.34), if $\Omega \subset \mathbb{C}$ is open and $f : \Omega \to \mathbb{C}$, then f is said to be complex
differentiable at $z_0 \in \Omega$, with derivative $f'(z_0)$, provided

(3.C.1)  $\lim_{w \to 0} \frac{f(z_0 + w) - f(z_0)}{w} = f'(z_0).$

One also denotes the limit by df/dz. Other terms used for such functions are
complex analytic and holomorphic. Here we sketch some results that lead to lots of
examples of holomorphic functions.

First, clearly f(z) = z is holomorphic, with f'(z) = 1. To go from here, we
have the following:

(3.C.2)  $f, g : \Omega \to \mathbb{C}$ holomorphic $\Longrightarrow fg$ holomorphic,


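where fg(z) = f(z)g(z). In fact, one can write

(3.C.3)  $\frac{f(z_0 + w) g(z_0 + w) - f(z_0) g(z_0)}{w} = \frac{f(z_0 + w) - f(z_0)}{w}\, g(z_0 + w) + f(z_0)\, \frac{g(z_0 + w) - g(z_0)}{w},$

and take $w \to 0$ to deduce that

(3.C.4)  $\frac{d}{dz}(fg)(z) = f'(z) g(z) + f(z) g'(z).$

We can apply this to $f_2(z) = z^2 = z \cdot z$ to get $f_2'(z) = 2z$, and, inductively,

(3.C.5)  $\frac{d}{dz} z^n = n z^{n-1}, \quad z \in \mathbb{C}, \ n \in \mathbb{N}.$

Next, we claim that 1/z is holomorphic on $\mathbb{C} \setminus 0$. Indeed, for $z_0 \ne 0$, $|w| < |z_0|$,

(3.C.6)  $\frac{1}{w} \Bigl( \frac{1}{z_0 + w} - \frac{1}{z_0} \Bigr) = -\frac{1}{z_0 (z_0 + w)},$

and taking $w \to 0$ yields

(3.C.7)  $\frac{d}{dz} \frac{1}{z} = -\frac{1}{z^2}, \quad z \in \mathbb{C} \setminus 0.$

Again an inductive application of (3.C.4) yields that $1/z^n$ is holomorphic on $\mathbb{C} \setminus 0$,
and

(3.C.8)  $\frac{d}{dz} \frac{1}{z^n} = -\frac{n}{z^{n+1}}, \quad z \in \mathbb{C} \setminus 0, \ n \in \mathbb{N}.$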


We turn to the exponential function $\operatorname{Exp}(z) = e^z$, introduced in Chapter 1. We
claim that this is holomorphic in $\mathbb{C}$, and

(3.C.9)  $\frac{d}{dz} e^z = e^z, \quad z \in \mathbb{C},$

extending from $\mathbb{R}$ to $\mathbb{C}$ the formula for the derivative established there. To see this,
use the identity $e^{z_0 + w} = e^{z_0} e^w$, established in Chapter 1, to write

(3.C.10)  $\frac{1}{w} (e^{z_0 + w} - e^{z_0}) = e^{z_0}\, \frac{e^w - 1}{w},$

and note that the last factor is equal to

(3.C.11)  $\frac{1}{w} \sum_{k=1}^\infty \frac{w^k}{k!} = 1 + \frac{w}{2!} + \frac{w^2}{3!} + \cdots \longrightarrow 1, \quad \text{as } w \to 0,$

yielding (3.C.9).
We can get lots more holomorphic functions by combining the examples above 
with the following general result, known as the chain rule for holomorphic functions. 


Proposition 3.C.1. Assume $\Omega, \mathcal{O} \subset \mathbb{C}$ are open and

(3.C.12)  $f : \Omega \to \mathcal{O}, \quad g : \mathcal{O} \to \mathbb{C}$

are holomorphic. Then $h = g \circ f : \Omega \to \mathbb{C}$, defined by

(3.C.13)  $h(z) = g(f(z)),$

is holomorphic, and

(3.C.14)  $h'(z) = g'(f(z)) f'(z), \quad z \in \Omega.$

Proof. We can write the definition (3.C.1) as

(3.C.15)  $f(z_0 + w) = f(z_0) + f'(z_0) w + r(z_0, w), \quad z_0 \in \Omega,$

where $r(z_0, w)/w \to 0$ as $w \to 0$ (we say $r(z_0, w) = o(w)$). Similarly for g. Then,
for $z_0 \in \Omega$,

(3.C.16)  $h(z_0 + w) = g(f(z_0 + w)) = g(f(z_0) + f'(z_0) w + r) = g(f(z_0)) + g'(f(z_0)) (f'(z_0) w + r) + r_1(z_0, w),$

with also $r_1(z_0, w) = o(w)$. Hence

(3.C.17)  $h(z_0 + w) = h(z_0) + g'(f(z_0)) f'(z_0) w + r_2(z_0, w),$

with $r_2(z_0, w) = o(w)$, and we have (3.C.14).
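
The chain rule (3.C.14) can be tested numerically by forming the difference
quotient in (3.C.1) along several complex directions. In the sketch below the
choices $f(z) = z^2 + 1$, $g = \exp$, the base point, and the step size are all assumptions
made for illustration.

```python
import cmath

def difference_quotient(h, z0, w):
    """(h(z0 + w) - h(z0)) / w, the quotient appearing in (3.C.1)."""
    return (h(z0 + w) - h(z0)) / w

f = lambda z: z * z + 1    # holomorphic on C, with f'(z) = 2z
g = cmath.exp              # holomorphic on C, with g' = g
h = lambda z: g(f(z))      # the composition g o f

z0 = 0.3 + 0.4j
predicted = g(f(z0)) * 2 * z0   # g'(f(z0)) f'(z0), as in (3.C.14)
# Let w approach 0 along several complex directions.
quotients = [difference_quotient(h, z0, 1e-6 * d) for d in (1, 1j, -1, 1 - 1j)]
```

All four quotients agree with the predicted derivative to within the step size,
reflecting the fact that the limit in (3.C.1) does not depend on the direction of w.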


Putting together these results yields such holomorphic functions as

(3.C.18)  $\frac{1}{z^2 + 1}, \quad e^{z/(z^2 + 1)}, \quad z \ne \pm i,$

and a host of others, which the reader can play around with.


For further information on the theory of complex analytic functions, one can 
see [47]. Chapter 7 of that text is devoted to a treatment of linear systems of ODE 
in a complex domain, extending the scope of the treatment given here in Sections 
3.10 and 3.11. 


Chapter 4 


Nonlinear systems of 
differential equations 


This final chapter brings to bear all the material presented before and pushes on 
to the heart of the subject, nonlinear systems of differential equations. Section 
4.1 begins with a demonstration of existence and uniqueness (for t close to $t_0$) of
solutions to

(4.0.1)  $\frac{dx}{dt} = F(t, x), \quad x(t_0) = x_0.$

Here x(t) is a path in $\Omega \subset \mathbb{R}^n$ and F is bounded and continuous on $I \times \Omega$ (with
$t_0 \in I$), and satisfies a Lipschitz condition in x. (See (4.1.2) for a definition.) We
study the issue of global existence, including positive results when F(t, x) is linear
in x. Section 4.2 studies the smoothness of the solution to (4.0.1) as a function of
$x_0$, given various additional hypotheses on F, and related issues.


Section 4.3 reveals a geometric flavor to (4.0.1), described in the language of 
vector fields and the flows they generate. A vector field on $\Omega \subset \mathbb{R}^n$ is a map
$F : \Omega \to \mathbb{R}^n$. This is a special case of (4.0.1), where F is independent of t. The
path x(t) in $\Omega$ satisfying (4.0.1) for such F is called the orbit of F through $x_0$;
denote it $\Phi^t(x_0)$. This gives rise to the family of maps $\Phi^t$, called the flow generated
by F. The phase portrait is introduced as a tool to understand the orbits and
flow, from a visual perspective. We pay particular attention to how phase portraits
look near critical points of a vector field F (which are points where F vanishes),
including special types known as sources, sinks, saddles, and centers. 


Section 4.4 discusses a particular class of vector fields, gradient vector fields, 
on a domain $\Omega \subset \mathbb{R}^n$. In case n = 2, this relates to the topic of exact equations,
discussed in many texts early on. We have broken with tradition and moved the 
discussion of exactness to here, to see it in a broader context. 


We move from generalities about nonlinear systems to settings in which they 
arise. Section 4.5 introduces a class of differential equations arising from Newton’s
law of motion, F = ma. This resumes the study introduced in §1.5 of Chapter
1. This time we are studying the interaction of several bodies, each moving in
n-dimensional space. We concentrate on central force problems. We show how a
two-body central force problem (for motion in $\mathbb{R}^n$) gives rise to a second order n x n
system, in center of mass coordinates. We look at this two-body problem in more
detail in §4.6, and derive Newton’s epoch-making analysis of the planetary motion 
problem. 


In §4.7 we introduce another (though ultimately related) class of problems that 
lead to differential equations, namely variational problems. The general setup is to 
consider

(4.0.2)  $I(u) = \int_a^b L(u(t), u'(t))\, dt,$

for paths $u : [a, b] \to \Omega \subset \mathbb{R}^n$, given smooth L on $\Omega \times \mathbb{R}^n$, and find conditions
under which I has a minimum, or maximum, or more generally a stationary point
u. We produce a differential equation known as the Lagrange equation for u. This
method has many important ramifications. One of the most important is to produce
differential equations for physical problems, providing an alternative to the
method discussed in §4.5. We illustrate this in §4.7 by obtaining a derivation of
the pendulum equation, alternative to that given in §1.6. We proceed to more so- 
phisticated uses of the variational method. In §4.8 we discuss the brachistochrone 
problem, tossed about by the early leading lights of calculus, one of the founda- 
tional variational problems. In §4.9 we discuss the double pendulum, a physical 
problem that is confounding when one uses the F = ma approach, and which 
well illustrates the Lagrangian approach. An alternative to Lagrangian differential 
equations is the class of Hamiltonian differential equations. The passage from La- 
grangian to Hamiltonian equations is previewed (in special cases) in §§4.7 and 4.9, 
and developed further in §4.10. 


The majority of the systems studied in this chapter are not amenable to solution 
in terms of explicit formulas. In §4.11 we introduce a tool that has revolutionized the 
study of these equations, namely numerical approximation. Behind this revolution 
is the availability of personal computers. In §4.11 we present several techniques 
that allow for accurate approximation of solutions to (4.0.1), the most important 
being Runge-Kutta difference schemes. 


In §4.12 we return to the study of qualitative features of phase portraits, ini- 
tiated in §4.3. We define limit sets of orbits, and establish a result known as the 
Poincaré-Bendixson theorem, which provides a condition under which a limit set 
for an orbit of a planar vector field can be shown to be a closed curve, called a limit 
cycle. 


Sections 4.13-4.14 are devoted to some systems of differential equations arising 
to model the populations of interacting species. In §4.13 we study predator-prey 
equations. We study several models. In some, all the orbits are periodic, except 
for one critical point. In others, there is a limit cycle, arising via the mechanism 
examined in §4.12. In §4.14 we look at other interacting species equations, namely 
equations modeling competing species. 
One phenomenon behind the Poincaré-Bendixson theorem is that an orbit of
a vector field F in the plane locally divides the plane into two parts, one to the
left of the orbit and one to the right. Since another orbit of F cannot cross it, 
this tends to separate the plane into pieces, in each of which the phase portrait 
has a fairly simple appearance. In dimension three and higher, this mechanism to 
enforce simplicity does not work, and far more complicated scenarios are possible. 
This leads to the occurrence of chaos for n x n systems of differential equations 
when n > 3. We explore some aspects of this in §4.15. 


This chapter ends with a number of appendices, some providing useful back- 
ground in calculus, and others taking up further topics in nonlinear systems of 
ODE. In Appendix 4.A we give basic information on the derivative of functions 
of several variables, reviewing material typically covered in third semester calculus 
and setting up notation that is used in this chapter. Appendix 4.B discusses some 
basic results about convergence, including the notion of compactness. 


In Appendix 4.C we show that if the linearization of a vector field F at a critical 
point behaves like a saddle, so does F. Appendix 4.D takes a further look at the 
behavior of a flow near a critical point of a vector field. It produces a blown up 
phase portrait of such a flow, by taking spherical polar coordinates centered at the 
critical point. 

Appendix 4.E discusses an approximation procedure for computing the periods 
of orbits, for a certain family of planar vector fields, with reference to how Einstein’s 
correction of Newton’s equations for planetary motion yields a calculation of the 
precession of the planet’s perihelion. In Appendix 4.F we show that a spherically 
symmetric planet produces the same gravitational field as if all its mass were con- 
centrated at its center. In Appendix 4.G we prove the Brouwer fixed-point theorem 
(in dimension 2), a use of which arises in §4.15. The proof we give makes use of 
material developed in §4.4. 


In Appendix 4.H we discuss geodesic equations on surfaces. This discussion 
continues results on minima and other critical paths for the energy functional 
introduced in §4.7, making contact with the length functional here, which is of interest for 
the differential geometry of surfaces. Appendix 4.I deals with rigid body motion in 
$\mathbb{R}^n$. We set up a Lagrangian and treat this as a variational problem. This approach 
leads to a geodesic equation on the rotation group $SO(n)$, endowed with a certain 
left-invariant metric. We reduce this to a system of ODE for $Z : I \to \mathrm{Skew}(n)$, with 
a quadratic nonlinearity. We specialize to $n = 3$ and produce Euler's equation for 
the free motion of a rigid body in $\mathbb{R}^3$, and show this is solvable in terms of elliptic 
integrals. 


4.1. Existence and uniqueness of solutions 


We investigate existence and uniqueness of solutions to a first order nonlinear $n \times n$ 
system of differential equations, 

(4.1.1) $\frac{dx}{dt} = F(t,x), \quad x(t_0) = x_0.$ 


We assume $F$ is bounded and continuous on $I \times \Omega$, where $I$ is an open interval 
about $t_0$ and $\Omega$ is an open subset of $\mathbb{R}^n$, containing $x_0$. We also assume $F$ satisfies 
a Lipschitz condition in $x$: 

(4.1.2) $\|F(t,x) - F(t,y)\| \le L\|x - y\|,$ 

for all $t \in I$, $x, y \in \Omega$, with $L \in (0,\infty)$. Such an estimate holds if $\Omega$ is convex and 
$F$ is $C^1$ in $x$ and satisfies 

(4.1.3) $\|D_x F(t,x)\| \le L,$ 

for all $t \in I$, $x \in \Omega$. At this point, the reader might want to review the concept 
of the derivative of a function of $n$ variables, by looking in Appendix 4.A. The 
implication (4.1.3) $\Rightarrow$ (4.1.2) follows readily from (4.A.9). Our first goal is to prove 
the following. 


Proposition 4.1.1. Assume $F : I \times \Omega \to \mathbb{R}^n$ is bounded and continuous and 
satisfies the Lipschitz condition (4.1.2), and let $x_0 \in \Omega$. Then there exists $T_0 > 0$ 
and a unique $C^1$ solution to (4.1.1) for $|t - t_0| < T_0$. 


The first step in proving this is to rewrite (4.1.1) as an integral equation: 

(4.1.4) $x(t) = x_0 + \int_{t_0}^{t} F(s, x(s))\, ds.$ 

The equivalence of (4.1.1) and (4.1.4) follows from the fundamental theorem of 
calculus. It suffices to find a continuous solution $x$ to (4.1.4) on $[t_0 - T_0, t_0 + T_0]$, 
since then the right side of (4.1.4) will be $C^1$ in $t$. 


We will apply a technique known as Picard iteration to construct a solution to 
(4.1.4). We set $x_0(t) = x_0$ and then define $x_n(t)$ inductively by 

(4.1.5) $x_{n+1}(t) = x_0 + \int_{t_0}^{t} F(s, x_n(s))\, ds.$ 

We show that this converges uniformly to a solution to (4.1.4), for $|t - t_0| \le T_0$, if 
$T_0$ is taken small enough. To get this, we quantify some hypotheses made above. 
We assume 


(4.1.6) $B_R(x_0) = \{x \in \mathbb{R}^n : \|x - x_0\| \le R\} \subset \Omega$ 

and 

(4.1.7) $\|F(s,x)\| \le M, \quad \forall\, s \in I,\ x \in B_R(x_0).$ 

Clearly, $x_0(t) = x_0$ takes values in $B_R(x_0)$ for all $t$. Suppose that $x_n(t)$ has been 
constructed, taking values in $B_R(x_0)$, and $x_{n+1}(t)$ is defined by (4.1.5). We have 

(4.1.8) $\|x_{n+1}(t) - x_0\| \le \int_{t_0}^{t} \|F(s, x_n(s))\|\, ds \le M|t - t_0|,$ 

so $x_{n+1}(t)$ also takes values in $B_R(x_0)$ provided $|t - t_0| \le T_0$ and 

(4.1.9) $T_0 \le \frac{R}{M}.$ 

As long as (4.1.9) holds and $[t_0 - T_0, t_0 + T_0] \subset I$, we get an infinite sequence $x_n(t)$ 
of functions, related by (4.1.5). 

We produce one more constraint on $T_0$, which will guarantee convergence. Note 
that, for $n \ge 1$, 

(4.1.10) $\|x_{n+1}(t) - x_n(t)\| = \Bigl\| \int_{t_0}^{t} \bigl[F(s, x_n(s)) - F(s, x_{n-1}(s))\bigr]\, ds \Bigr\| \le \int_{t_0}^{t} \|F(s, x_n(s)) - F(s, x_{n-1}(s))\|\, ds \le L \int_{t_0}^{t} \|x_n(s) - x_{n-1}(s)\|\, ds,$ 

the last inequality by (4.1.2). Hence 

(4.1.11) $\max_{|t - t_0| \le T_0} \|x_{n+1}(t) - x_n(t)\| \le L T_0 \max_{|s - t_0| \le T_0} \|x_n(s) - x_{n-1}(s)\|.$ 


The additional constraint we impose on $T_0$ is 

(4.1.12) $T_0 \le \frac{a}{L}, \quad a \in (0,1).$ 

Noting that 

(4.1.13) $\max_{|t - t_0| \le T_0} \|x_1(t) - x_0\| \le R,$ 

we see that 

(4.1.14) $\max_{|t - t_0| \le T_0} \|x_{n+1}(t) - x_n(t)\| \le a^n R.$ 

Consequently, the infinite series 

(4.1.15) $x(t) = x_0 + \sum_{n=0}^{\infty} \bigl(x_{n+1}(t) - x_n(t)\bigr)$ 

is absolutely and uniformly convergent for $|t - t_0| \le T_0$, with a continuous sum, 
satisfying 

(4.1.16) $\max_{|t - t_0| \le T_0} \|x(t) - x_n(t)\| \le \frac{a^n}{1 - a}\, R.$ 

It readily follows that 

(4.1.17) $\int_{t_0}^{t} F(s, x_n(s))\, ds \longrightarrow \int_{t_0}^{t} F(s, x(s))\, ds,$ 

so (4.1.4) follows from (4.1.5) in the limit $n \to \infty$. 

To finish the proof of Proposition 4.1.1, we establish uniqueness. Suppose $y(t)$ 
also satisfies (4.1.4) for $|t - t_0| \le T_0$. Then 

(4.1.18) $\|x(t) - y(t)\| = \Bigl\| \int_{t_0}^{t} \bigl[F(s, x(s)) - F(s, y(s))\bigr]\, ds \Bigr\| \le \int_{t_0}^{t} \|F(s, x(s)) - F(s, y(s))\|\, ds \le L \int_{t_0}^{t} \|x(s) - y(s)\|\, ds,$ 

and hence 

(4.1.19) $\max_{|t - t_0| \le T_0} \|x(t) - y(t)\| \le T_0 L \max_{|s - t_0| \le T_0} \|x(s) - y(s)\|.$ 

As long as (4.1.12) holds, $T_0 L \le a < 1$, so (4.1.19) clearly implies 
$\max_{|t - t_0| \le T_0} \|x(t) - y(t)\| = 0$, which gives the asserted uniqueness. 
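
As a concrete illustration of the iteration (4.1.5), here is a minimal sketch that carries out Picard iteration exactly for the sample problem $dx/dt = t + x$, $x(0) = 0$. This problem is an assumption chosen for illustration (it is not taken from the text); its solution is $x(t) = e^t - 1 - t$. Since this $F$ is linear in $x$, each iterate is a polynomial in $t$, stored below as a list of exact `Fraction` coefficients. All helper names are hypothetical.

```python
import math
from fractions import Fraction


def integrate(poly):
    """Antiderivative of sum(poly[k] * t**k) that vanishes at t = 0."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]


def add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def picard_step(x_n):
    """One step of (4.1.5) for F(t, x) = t + x, with t0 = 0 and x0 = 0."""
    integrand = add([Fraction(0), Fraction(1)], x_n)  # s + x_n(s)
    return integrate(integrand)  # x0 = 0, so no constant is added


def picard(n_steps):
    """Coefficients of the n-th Picard iterate, starting from x_0(t) = 0."""
    x = [Fraction(0)]
    for _ in range(n_steps):
        x = picard_step(x)
    return x


def evaluate(poly, t):
    """Evaluate the polynomial at a float t."""
    return sum(float(c) * t**k for k, c in enumerate(poly))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, [str(c) for c in picard(n)])
    print(evaluate(picard(12), 0.5), math.exp(0.5) - 1.5)
```

In this example the iterates are the Taylor partial sums of $e^t - 1 - t$, so they converge for every $t$; the restriction to a short interval $|t - t_0| \le T_0$ in the general argument reflects what can be guaranteed for a general Lipschitz $F$.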


Note that the Lipschitz hypothesis (4.1.2) was needed only for $x, y \in B_R(x_0)$. 
Thus we can extend Proposition 4.1.1 to the following setting: 

(4.1.20) For each closed, bounded $K \subset \Omega$, there exists $L_K < \infty$ such that 
$\|F(t,x) - F(t,y)\| \le L_K \|x - y\|, \quad \forall\, x, y \in K,\ t \in I.$ 

We can also replace the bound on $F$ by 

(4.1.21) For each $K$ as above, there exists $M_K < \infty$ such that 
$\|F(t,x)\| \le M_K, \quad \forall\, x \in K,\ t \in I.$ 

Results of Appendix 4.B imply that there exists $R_K > 0$ such that 

(4.1.22) $\widetilde{K} = \bigcup_{x \in K} B_{R_K}(x)$ is a compact subset of $\Omega$. 

It follows that for each $x_0 \in K$, the solution to (4.1.1) exists on the interval 

(4.1.23) $\{t \in I : |t - t_0| \le \min(R_K / M_{\widetilde{K}},\ a / L_{\widetilde{K}})\}.$ 

Now that we have local solutions to (4.1.1), it is of interest to investigate when 
global solutions exist. Here is an example of breakdown: 

(4.1.24) $\frac{dx}{dt} = x^2, \quad x(0) = 1.$ 

Here $I = \mathbb{R}$, $n = 1$, $\Omega = \mathbb{R}$, and $F(x) = x^2$ is smooth, satisfying the local bounds 
(4.1.21)-(4.1.23). The equation (4.1.24) has the unique solution 

(4.1.25) $x(t) = \frac{1}{1 - t}, \quad t \in (-\infty, 1),$ 

which blows up as $t \nearrow 1$. It is useful to know that blowing up is the only way a 
solution can fail to exist globally. We have the following result. 


Proposition 4.1.2. Let $F$ be as in Proposition 4.1.1, but with the Lipschitz and 
boundedness hypotheses relaxed to (4.1.20)-(4.1.21). Assume $[a,b]$ is contained in 
the open interval $I$ and assume $x(t)$ solves (4.1.1) for $t \in (a,b)$. Assume there 
exists a closed, bounded set $K \subset \Omega$ such that $x(t) \in K$ for all $t \in (a,b)$. Then there 
exist $a_1 < a$ and $b_1 > b$ such that $x(t)$ solves (4.1.1) for $t \in (a_1, b_1)$. 


Proof. We deduce from (4.1.23) that there exists $\delta > 0$ such that for each $x_1 \in K$, 
$t_1 \in [a,b]$, the solution to 

(4.1.26) $\frac{dx}{dt} = F(t,x), \quad x(t_1) = x_1$ 

exists on the interval $[t_1 - \delta, t_1 + \delta]$. Now, under the current hypotheses, take 
$t_1 \in (b - \delta/2, b)$, $x_1 = x(t_1)$, with $x(t)$ solving (4.1.1). Then solving (4.1.26) 
continues $x(t)$ past $t = b$. Similarly one can continue $x(t)$ past $t = a$. 
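
The blow-up in (4.1.24)-(4.1.25) can also be observed numerically. The following is a minimal sketch, assuming a fixed-step classical Runge-Kutta integrator (the function names and the step size are illustrative choices, not taken from the text); it compares the computed solution of $dx/dt = x^2$, $x(0) = 1$ with the formula $1/(1-t)$ at times approaching $t = 1$.

```python
def rk4_scalar(f, x, h, n_steps):
    """Integrate the autonomous scalar ODE x' = f(x) with n_steps RK4 steps."""
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


def square(x):
    """Right side of (4.1.24)."""
    return x * x


if __name__ == "__main__":
    h = 1e-4
    for n_steps in (5000, 9000, 9900):
        t = n_steps * h
        approx = rk4_scalar(square, 1.0, h, n_steps)
        print(f"t = {t:.2f}  numerical = {approx:.6f}  exact = {1 / (1 - t):.6f}")
```

A fixed step size becomes unreliable very close to $t = 1$, since the solution and all its derivatives grow without bound there; this is the numerical counterpart of the fact that the orbit leaves every closed, bounded set $K$.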

Here is an example of a global existence result that can be deduced from 
Proposition 4.1.2. Consider the $2 \times 2$ system for $x = (y, v)$: 

(4.1.27) $\frac{dy}{dt} = v, \quad \frac{dv}{dt} = -y^3.$ 

Here we have $\Omega = \mathbb{R}^2$, $F(t,x) = F(t,y,v) = (v, -y^3)$. If (4.1.27) holds for $t \in (a,b)$, 
we have 

(4.1.28) $\frac{d}{dt}\Bigl(\frac{v^2}{2} + \frac{y^4}{4}\Bigr) = v\,\frac{dv}{dt} + y^3\,\frac{dy}{dt} = 0,$ 

so each $x(t) = (y(t), v(t))$ solving (4.1.27) lies in a level curve $y^4/4 + v^2/2 = C$, 
hence is confined to a closed, bounded subset of $\mathbb{R}^2$, yielding global existence of 
solutions to (4.1.27). 
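
The conservation law behind this global existence result can be checked numerically. The sketch below is an illustration under stated assumptions: it integrates the system $dy/dt = v$, $dv/dt = -y^3$ with a fixed-step classical Runge-Kutta scheme (the helper names, step size, and initial point are arbitrary choices) and records how far $y^4/4 + v^2/2$ drifts from its initial value along the computed orbit.

```python
def rk4_step(f, x, h):
    """One classical Runge-Kutta step for the autonomous system x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def field(x):
    """The vector field F(y, v) = (v, -y^3)."""
    y, v = x
    return [v, -y**3]


def conserved(x):
    """The quantity y^4/4 + v^2/2, constant along exact solutions."""
    y, v = x
    return y**4 / 4 + v**2 / 2


def max_drift(x0, h=1e-3, n_steps=20000):
    """Largest deviation of the conserved quantity along the numerical orbit."""
    x, c0, drift = list(x0), conserved(x0), 0.0
    for _ in range(n_steps):
        x = rk4_step(field, x, h)
        drift = max(drift, abs(conserved(x) - c0))
    return drift


if __name__ == "__main__":
    print(max_drift([1.0, 0.5]))
```

The drift reported is a property of the numerical scheme, not of the differential equation; for the exact flow the quantity is constant, which is what confines each orbit to a closed, bounded level set.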


We can also apply Proposition 4.1.2 to establish global existence of solutions 
to linear systems, 

(4.1.29) $\frac{dx}{dt} = A(t)x, \quad x(0) = x_0,$ 

given $A(t)$ continuous in $t \in I$ (an interval about 0), with values in $M(n,\mathbb{C})$. It 
suffices to establish the following. 


Proposition 4.1.3. If $\|A(t)\| \le K$ for $t \in I$, then the solution to (4.1.29) satisfies 

(4.1.30) $\|x(t)\| \le e^{K|t|}\, \|x_0\|.$ 


Proof. It suffices to prove (4.1.30) for $t \ge 0$. Then $y(t) = e^{-Kt}x(t)$ satisfies 

(4.1.31) $\frac{dy}{dt} = C(t)y, \quad y(0) = x_0,$ 

with $C(t) = A(t) - K$. Hence $C(t)$ satisfies 

(4.1.32) $\mathrm{Re}\,(C(t)u, u) \le 0, \quad \forall\, u \in \mathbb{C}^n.$ 

Then (4.1.30) is a consequence of the following estimate, of interest in its own 
right. 


Lemma 4.1.4. If $y(t)$ solves (4.1.31) and (4.1.32) holds for $C(t)$, then 

(4.1.33) $\|y(t)\| \le \|y(0)\|$ for $t \ge 0$. 


Proof. We have 

(4.1.34) $\frac{d}{dt}\|y(t)\|^2 = (y'(t), y(t)) + (y(t), y'(t)) = 2\,\mathrm{Re}\,(C(t)y(t), y(t)) \le 0,$ 

which gives (4.1.33). 

Thanks to Proposition 4.1.3, we have for $s, t \in I$, the solution operator for 
(4.1.29), 

(4.1.35) $S(t,s) \in M(n,\mathbb{C}), \quad S(t,s)x(s) = x(t),$ 

introduced in §3.8 of Chapter 3. As noted there, we have the Duhamel formula 

(4.1.36) $x(t) = S(t,t_0)x_0 + \int_{t_0}^{t} S(t,s) f(s)\, ds,$ 

for the solution to 

(4.1.37) $\frac{dx}{dt} = A(t)x + f(t), \quad x(t_0) = x_0.$ 


If $F(t,x)$ depends explicitly on $t$, we call (4.1.1) a nonautonomous system. If $F$ 
does not depend explicitly on $t$, we say (4.1.1) is autonomous. The following device 
converts a nonautonomous system to an autonomous one. Take the $n \times n$ system 
(4.1.1). Then the $(n+1) \times (n+1)$ system 

(4.1.38) $\frac{dx}{dt} = F(y,x), \quad \frac{dy}{dt} = 1, \quad x(t_0) = x_0, \quad y(t_0) = t_0$ 

has the autonomous form 

(4.1.39) $\frac{dz}{dt} = G(z), \quad z(t_0) = (x_0, t_0),$ 

for $z = (x,y)$, with $G(z) = (F(y,x), 1)$, and the solution to (4.1.38) is $(x(t), t)$, 
where $x(t)$ solves (4.1.1). Thus for many purposes it suffices to consider autonomous 
systems. 
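
The device of adjoining time as an extra unknown is easy to express in code. The following minimal sketch wraps a time-dependent right side `F(t, x)` into an autonomous field `G(z)` with `z = (x, y)` and integrates it with a Runge-Kutta step that never sees `t`. The test problem $dx/dt = x\cos t$, $x(0) = 1$, with solution $e^{\sin t}$, and all function names are assumptions made for illustration.

```python
import math


def make_autonomous(F):
    """Return G(z) = (F(y, x), 1) for z = (x, y); y plays the role of time."""
    def G(z):
        x, y = z[:-1], z[-1]
        return list(F(y, x)) + [1.0]
    return G


def rk4_step(G, z, h):
    """One classical Runge-Kutta step for the autonomous system z' = G(z)."""
    k1 = G(z)
    k2 = G([zi + 0.5 * h * ki for zi, ki in zip(z, k1)])
    k3 = G([zi + 0.5 * h * ki for zi, ki in zip(z, k2)])
    k4 = G([zi + h * ki for zi, ki in zip(z, k3)])
    return [zi + h * (a + 2 * b + 2 * c + d) / 6
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]


def solve(F, t0, x0, t1, n_steps=2000):
    """Integrate x' = F(t, x), x(t0) = x0, via the autonomous form for z."""
    G = make_autonomous(F)
    z = list(x0) + [t0]          # initial data z(t0) = (x0, t0)
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        z = rk4_step(G, z, h)
    return z


def sample_field(t, x):
    """Illustrative nonautonomous right side: x' = x cos t."""
    return [math.cos(t) * x[0]]


if __name__ == "__main__":
    z = solve(sample_field, 0.0, [1.0], 2.0)
    print(z, math.exp(math.sin(2.0)))
```

The last component of the computed `z` simply reproduces the time variable, in line with the statement that the solution of the enlarged system is $(x(t), t)$.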

To close this section, we note how a higher order $n \times n$ system, such as 

(4.1.40) $\frac{d^k x}{dt^k} = F(t, x, \ldots, x^{(k-1)}), \quad x(t_0) = x_0, \ \ldots,\ x^{(k-1)}(t_0) = x_{k-1},$ 

can be converted to a first order $nk \times nk$ system, for 

(4.1.41) $y = (y_0, \ldots, y_{k-1})^t, \quad y_j(t) \in \mathbb{R}^n, \quad 0 \le j \le k-1.$ 

The system is 

(4.1.42) $\frac{dy_0}{dt} = y_1, \quad \ldots, \quad \frac{dy_{k-2}}{dt} = y_{k-1}, \quad \frac{dy_{k-1}}{dt} = F(t, y_0, \ldots, y_{k-1}),$ 

with initial data 

(4.1.43) $y_j(t_0) = x_j, \quad 0 \le j \le k-1.$ 

If $y(t)$ solves (4.1.42)-(4.1.43), then $x(t) = y_0(t)$ solves (4.1.40), and 

(4.1.44) $x^{(j)}(t) = y_j(t), \quad 0 \le j \le k-1.$ 

Note how this construction is parallel to that done in the linear case in §3.3. 
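
The reduction of a higher order equation to a first order system can be sketched in a few lines. The code below is an illustration restricted to the scalar case $n = 1$; the helper names and the test equation $x''' = -x'$ with $x(0) = 0$, $x'(0) = 1$, $x''(0) = 0$ (solution $x = \sin t$) are assumptions made for the example, not taken from the text.

```python
import math


def to_first_order(F):
    """Scalar case of the reduction: y = (y_0, ..., y_{k-1}) with y_j = x^(j)."""
    def f(t, y):
        # y_0' = y_1, ..., y_{k-2}' = y_{k-1}, y_{k-1}' = F(t, y_0, ..., y_{k-1})
        return list(y[1:]) + [F(t, *y)]
    return f


def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h * (a + 2 * b + 2 * c + d) / 6
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def solve(F, t0, data, t1, n_steps=1000):
    """Integrate x^(k) = F(t, x, ..., x^(k-1)) with initial data (x_0, ..., x_{k-1})."""
    f = to_first_order(F)
    y, t = list(data), t0
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def third_order(t, x, x1, x2):
    """Illustrative right side for x''' = -x'."""
    return -x1


if __name__ == "__main__":
    y = solve(third_order, 0.0, [0.0, 1.0, 0.0], 1.0)
    print(y, [math.sin(1.0), math.cos(1.0), -math.sin(1.0)])
```

The components of the returned list approximate $x$, $x'$, and $x''$ at the final time, which is the content of the identification $x^{(j)}(t) = y_j(t)$.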


Exercises 


1. Apply the Picard iteration method to 

$\frac{dx}{dt} = ax, \quad x(0) = 1,$ 

given $a \in \mathbb{C}$. Taking $x_0(t) = 1$, show that 

$x_n(t) = \sum_{k=0}^{n} \frac{a^k}{k!}\, t^k.$ 

2. Discuss the matrix analogue of Exercise 1. 


3. Consider the initial value problem 

$\frac{dx}{dt} = x^2, \quad x(0) = 1.$ 

Take $x_0 = 1$ and use the Picard iteration method (4.1.5) to write out 

$x_n(t), \quad n = 1, 2, 3.$ 

Compare the results with the formula (4.1.25). 


4. Given $A_0, A_1 \in M(n,\mathbb{C})$, consider the initial value problem 

$\frac{dx}{dt} = (A_0 + A_1 t)x, \quad x(0) = x_0.$ 

Take $x_0(t) = x_0$ and use the Picard iteration (4.1.5) to write out 

$x_n(t), \quad n = 1, 2, 3.$ 

Compare and contrast the results with calculations from §3.10 of Chapter 3. 


5. Let $x_n(t)$ be an approximate solution to (4.1.1), and assume that 

$\|x(t) - x_n(t)\| \le b_n |t - t_0|^n, \quad \text{for } t \in I.$ 

Let $x_{n+1}(t)$ be defined by (4.1.5), and assume the Lipschitz condition (4.1.2) holds. 
Show that 

$\|x(t) - x_{n+1}(t)\| \le \frac{b_n L}{n+1}\, |t - t_0|^{n+1}, \quad t \in I.$ 


6. Modify the system (4.1.27) to 

$\frac{dy}{dt} = v, \quad \frac{dv}{dt} = -y^3 - v.$ 

Show that solutions satisfy 

$\frac{d}{dt}\Bigl(\frac{v^2}{2} + \frac{y^4}{4}\Bigr) = -v^2,$ 

and use this to establish global existence for $t \ge 0$. 


7. Consider the initial value problem 

$\frac{dx}{dt} = |x|^{1/2}, \quad x(0) = 0.$ 

Note that $x(t) \equiv 0$ is a solution, and 

$x(t) = \frac{1}{4}t^2$ for $t \ge 0$, $\quad x(t) = 0$ for $t \le 0$ 

is another solution, on $t \in (-\infty, \infty)$. Why does this not contradict the uniqueness 
part of Proposition 4.1.1? Can you produce other solutions to this initial value 
problem? 


8. Take $\beta \in (0,\infty)$ and consider the initial value problem 

$\frac{dx}{dt} = x^\beta, \quad x(0) = 1.$ 

Show that this has a solution for all $t \ge 0$ if and only if $\beta \le 1$. 


9. Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be $C^1$ and suppose $x(t)$ solves 

(4.1.45) $\frac{dx}{dt} = F(x), \quad x(t_0) = x_0,$ 

for $t \in I$, an open interval containing $t_0$. Show that, for $t \in I$, 

(4.1.46) $\frac{d}{dt}\|x(t)\|^2 = 2x(t) \cdot F(x(t)).$ 

Show that, if $\alpha > 0$ and $x(t) \ne 0$, 

(4.1.47) $\frac{d}{dt}\|x(t)\|^\alpha = \alpha\|x(t)\|^{\alpha - 2}\, x(t) \cdot F(x(t)).$ 


10. In the setting of Exercise 9, suppose $F$ satisfies an estimate 

(4.1.48) $\|F(x)\| \le C(1 + \|x\|)^\beta, \quad \forall\, x \in \mathbb{R}^n, \quad C < \infty,\ \beta < 1.$ 

Show that there exist $\alpha > 0$ and $K < \infty$ such that, if $\|x(t)\| \ge 1$ for $t \in I$, 

$\Bigl|\frac{d}{dt}\|x(t)\|^\alpha\Bigr| \le K, \quad \forall\, t \in I.$ 

Use this to establish that the solution to (4.1.45) exists for all $t \in \mathbb{R}$. 

Gronwall's inequality and consequences 


Exercises 11-13 below will extend the conclusion of Exercise 10 to the case $\beta = 1$ in 
(4.1.48). One approach is via the following result, known as Gronwall's inequality. 


Proposition 4.1.5. Assume 

(4.1.49) $g \in C^1(\mathbb{R}), \quad g' \ge 0.$ 

Let $u$ and $v$ be real valued, continuous functions on $I$ satisfying 

(4.1.50) $u(t) \le A + \int_{t_0}^{t} g(u(s))\, ds, \qquad v(t) \ge A + \int_{t_0}^{t} g(v(s))\, ds.$ 

Then 

(4.1.51) $u(t) \le v(t), \quad \text{for } t \in I,\ t \ge t_0.$ 


Proof. Set $w(t) = u(t) - v(t)$. Then 

(4.1.52) $w(t) \le \int_{t_0}^{t} \bigl[g(u(s)) - g(v(s))\bigr]\, ds = \int_{t_0}^{t} M(s)w(s)\, ds,$ 

where 

(4.1.53) $M(s) = \int_0^1 g'\bigl(\tau u(s) + (1 - \tau)v(s)\bigr)\, d\tau.$ 

Hence we have 

(4.1.54) $w(t) \le \int_{t_0}^{t} M(s)w(s)\, ds, \quad M(s) \ge 0, \quad M \in C(I),$ 

and we claim this implies 

(4.1.55) $w(t) \le 0, \quad \forall\, t \in I,\ t \ge t_0.$ 

In other words, we claim that $w(t) \le 0$ on $[t_0, b]$ whenever $[t_0, b] \subset I$. To see this, 
let $t_1$ be the largest number in $[t_0, b]$ with the property that $w \le 0$ on $[t_0, t_1]$. We 
claim that $t_1 = b$. 

Assume to the contrary that $t_1 < b$. Noting that $\int_{t_0}^{t_1} M(s)w(s)\, ds \le 0$, we 
deduce from (4.1.54) that 

(4.1.56) $w(t) \le \int_{t_1}^{t} M(s)w(s)\, ds, \quad \forall\, t \in [t_1, b].$ 

Hence, with 

(4.1.57) $K = \max_{[t_1, b]} M(s) < \infty,$ 

we have, for $a \in (t_1, b)$, 

(4.1.58) $\max_{[t_1, a]} w(t) \le (a - t_1)K \max_{[t_1, a]} w(s).$ 

If we pick $a \in (t_1, b)$ such that $(a - t_1)K < 1$, this implies 

(4.1.59) $w(t) \le 0, \quad \forall\, t \in [t_1, a],$ 

contradicting the maximality of $t_1$. Hence actually $t_1 = b$, and we have the 
implication (4.1.54) $\Rightarrow$ (4.1.55), completing the proof of Proposition 4.1.5. 


11. Assume $v \ge 0$ is a $C^1$ function on $I = (a,b)$, satisfying 

(4.1.60) $\frac{dv}{dt} \le Cv, \quad v(t_0) = v_0,$ 

where $C \in (0,\infty)$ and $t_0 \in I$. Using Proposition 4.1.5, show that 

(4.1.61) $v(t) \le e^{C(t - t_0)}v_0, \quad \forall\, t \in [t_0, b).$ 


12. In the setting of Exercise 11, avoid use of Proposition 4.1.5 as follows. Write 
(4.1.60) as 

(4.1.62) $\frac{dv}{dt} = Cv - g(t), \quad v(t_0) = v_0, \quad g \ge 0,$ 

with solution 

(4.1.63) $v(t) = e^{C(t - t_0)}v_0 - \int_{t_0}^{t} e^{C(t - s)} g(s)\, ds.$ 

Deduce (4.1.61) from this. 


13. Return to the setting of Exercise 9, and replace the hypothesis (4.1.48) by 

(4.1.64) $\|F(x)\| \le C(1 + \|x\|), \quad \forall\, x \in \mathbb{R}^n.$ 

Show that the solution to (4.1.45) exists for all $t \in \mathbb{R}$. 
Hint. Take $v(t) = 1 + \|x(t)\|^2$ and use (4.1.46). Show that Exercise 11 (or 12) 
applies. 


4.2. Dependence of solutions on initial data and other parameters 


We study how the solution to a system of differential equations 

(4.2.1) $\frac{dx}{dt} = F(x), \quad x(0) = y$ 

depends on the initial condition $y$. As shown in §4.1, there is no loss of generality in 
considering the autonomous system (4.2.1). We will assume $F : \Omega \to \mathbb{R}^n$ is smooth, 
$\Omega \subset \mathbb{R}^n$ open and convex, and denote the solution to (4.2.1) by $x = x(t,y)$. We want 
to examine smoothness in $y$. Let $DF(x)$ denote the $n \times n$ matrix valued function 
of partial derivatives of $F$. (See Appendix 4.A for more on this derivative.) 

To start, we assume $F$ is of class $C^1$, i.e., $DF$ is continuous on $\Omega$, and we want 
to show $x(t,y)$ is differentiable in $y$. Let us recall what this means. Take $y \in \Omega$ 
and pick $R > 0$ such that $B_R(y)$, defined as in (4.1.6), is contained in $\Omega$. We seek 
an $n \times n$ matrix $W(t,y)$ such that, for $w_0 \in \mathbb{R}^n$, $\|w_0\| \le R$, 

(4.2.2) $x(t, y + w_0) = x(t,y) + W(t,y)w_0 + r(t,y,w_0),$ 

where 

(4.2.3) $r(t,y,w_0) = o(\|w_0\|),$ 

which means 

(4.2.4) $\lim_{w_0 \to 0} \frac{r(t,y,w_0)}{\|w_0\|} = 0.$ 

When this holds, $x(t,y)$ is differentiable in $y$, and 

(4.2.5) $D_y x(t,y) = W(t,y).$ 

In other words, 

(4.2.6) $x(t, y + w_0) = x(t,y) + D_y x(t,y)w_0 + o(\|w_0\|).$ 

In the course of proving this differentiability, we also want to produce an 
equation for $W(t,y) = D_y x(t,y)$. This can be done as follows. Suppose $x(t,y)$ were 
differentiable in $y$. (We do not yet know that it is, but that is okay.) Then $F(x(t,y))$ 
is differentiable in $y$, so we can apply $D_y$ to (4.2.1). Using the chain rule, we get 
the following equation, 

(4.2.7) $\frac{dW}{dt} = DF(x)W, \quad W(0,y) = I,$ 

called the linearization of (4.2.1). Here, $I$ is the $n \times n$ identity matrix. Equivalently, 
given $w_0 \in \mathbb{R}^n$, 

(4.2.8) $w(t,y) = W(t,y)w_0$ 

is expected to solve 

(4.2.9) $\frac{dw}{dt} = DF(x)w, \quad w(0) = w_0.$ 

Now, we do not yet know that $x(t,y)$ is differentiable, but we do know from results 
of §4.1 that (4.2.7) and (4.2.9) are uniquely solvable. It remains to show that, with 
such a choice of $W(t,y)$, (4.2.2)-(4.2.3) hold. 

To rephrase the task, set 

(4.2.10) $x(t) = x(t,y), \quad x_1(t) = x(t, y + w_0), \quad z(t) = x_1(t) - x(t),$ 

and let $w(t)$ solve (4.2.9). The task of verifying (4.2.2)-(4.2.3) is equivalent to the 
task of verifying 

(4.2.11) $\|z(t) - w(t)\| = o(\|w_0\|).$ 

To show this, we will obtain for $z(t)$ an equation similar to (4.2.9). To begin, 
(4.2.10) implies 

(4.2.12) $\frac{dz}{dt} = F(x_1) - F(x), \quad z(0) = w_0.$ 

Now the fundamental theorem of calculus gives 

(4.2.13) $F(x_1) - F(x) = G(x_1, x)(x_1 - x),$ 

with 

(4.2.14) $G(x_1, x) = \int_0^1 DF\bigl(\tau x_1 + (1 - \tau)x\bigr)\, d\tau.$ 

If $F$ is $C^1$, then $G$ is continuous. Then (4.2.12)-(4.2.13) yield 

(4.2.15) $\frac{dz}{dt} = G(x_1, x)z, \quad z(0) = w_0.$ 

Given that 

(4.2.16) $\|DF(u)\| \le L, \quad \forall\, u \in \Omega,$ 

which we have by continuity of $DF$, after possibly shrinking $\Omega$ slightly, we deduce 
from Proposition 4.1.3 that 

(4.2.17) $\|z(t)\| \le e^{|t|L}\, \|w_0\|,$ 

that is, 

(4.2.18) $\|x(t,y) - x(t, y + w_0)\| \le e^{|t|L}\, \|w_0\|.$ 

This establishes that $x(t,y)$ is Lipschitz in $y$. 


To proceed, since $G$ is continuous and $G(x,x) = DF(x)$, we can rewrite (4.2.15) 
as 

(4.2.19) $\frac{dz}{dt} = G(x + z, x)z = DF(x)z + R(x,z), \quad z(0) = w_0,$ 

where 

(4.2.20) $F \in C^1(\Omega) \Longrightarrow \|R(x,z)\| = o(\|z\|) = o(\|w_0\|).$ 

Now comparing (4.2.19) with (4.2.9), we have 

(4.2.21) $\frac{d}{dt}(z - w) = DF(x)(z - w) + R(x,z), \quad (z - w)(0) = 0.$ 

Then Duhamel's formula gives 

(4.2.22) $z(t) - w(t) = \int_0^t S(t,s)R(x(s), z(s))\, ds,$ 

where $S(t,s)$ is the solution operator for $d/dt - B(t)$, with $B(t) = DF(x(t))$, which 
as in (4.2.17), satisfies 

(4.2.23) $\|S(t,s)\| \le e^{|t - s|L}.$ 

We hence have (4.2.11), i.e., 

(4.2.24) $\|z(t) - w(t)\| = o(\|w_0\|).$ 


This is precisely what is required to show that $x(t,y)$ is differentiable with respect 
to $y$, with derivative $W = D_y x(t,y)$ satisfying (4.2.7). Hence we have the following. 


Proposition 4.2.1. If $F \in C^1(\Omega)$ and if solutions to (4.2.1) exist for $t \in (-T_0, T_1)$, 
then, for each such $t$, $x(t,y)$ is $C^1$ in $y$, with derivative $D_y x(t,y)$ satisfying (4.2.7). 


We have shown that $x(t,y)$ is both Lipschitz and differentiable in $y$. The 
continuity of $W(t,y)$ in $y$ follows easily by comparing the differential equations of 
the form (4.2.7) for $W(t,y)$ and $W(t, y + w_0)$, in the spirit of the analysis of $z(t)$ 
done above. 

If $F$ possesses further smoothness, we can establish higher differentiability of 
the function $x(t,y)$ in $y$ by the following trick. Couple (4.2.1) and (4.2.7), to get a 
system of differential equations for $(x, W)$, 

(4.2.25) $\frac{dx}{dt} = F(x), \quad \frac{dW}{dt} = DF(x)W,$ 

with initial conditions 

(4.2.26) $x(0) = y, \quad W(0) = I.$ 

We can reiterate the preceding argument, getting results on $D_y(x, W)$, hence on 
the matrix of second derivatives $D_y^2 x(t,y)$, and continue, proving the following. 


Proposition 4.2.2. If $F \in C^k(\Omega)$, then $x(t,y)$ is $C^k$ in $y$. 
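
The coupled system for $(x, W)$ also gives a practical way to compute the derivative of the solution with respect to its initial value. The sketch below is a scalar illustration under stated assumptions: it takes the sample field $F(x) = \sin x$ (so $DF(x) = \cos x$), integrates $x' = \sin x$, $W' = \cos(x)\,W$ with $x(0) = y$, $W(0) = 1$ by a fixed-step Runge-Kutta scheme, and compares $W$ with a centered difference quotient in $y$. The field, the names, and the step sizes are choices made for the example.

```python
import math


def rk4_step(f, x, h):
    """One classical Runge-Kutta step for the autonomous system x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def coupled(state):
    """Coupled system for F(x) = sin x: x' = sin x, W' = cos(x) W."""
    x, W = state
    return [math.sin(x), math.cos(x) * W]


def flow_and_derivative(y, t, n_steps=2000):
    """Return [x(t, y), W(t, y)] starting from x(0) = y, W(0) = 1."""
    state = [y, 1.0]
    h = t / n_steps
    for _ in range(n_steps):
        state = rk4_step(coupled, state, h)
    return state


if __name__ == "__main__":
    y, t, eps = 1.0, 2.0, 1e-5
    x, W = flow_and_derivative(y, t)
    quotient = (flow_and_derivative(y + eps, t)[0]
                - flow_and_derivative(y - eps, t)[0]) / (2 * eps)
    print(W, quotient)
```

For this particular field one can check by differentiation that $W(t,y) = \sin x(t,y) / \sin y$ solves the linearized equation with $W(0,y) = 1$, which provides an independent check on the computed value.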


Similarly, we can consider dependence of the solution to 

(4.2.27) $\frac{dx}{dt} = F(\tau, x), \quad x(0) = y$ 

on a parameter $\tau$, assuming $F$ smooth jointly in $(\tau, x)$. This result can be deduced 
from the previous one by the following trick. Consider the system 

(4.2.28) $\frac{dx}{dt} = F(z, x), \quad \frac{dz}{dt} = 0, \quad x(0) = y, \quad z(0) = \tau.$ 

Then we get smoothness of $x(t, \tau, y)$ jointly in $(\tau, y)$. As a special case, let $F(\tau, x) = 
\tau F(x)$. In this case $x(t_0, \tau, y) = x(\tau t_0, y)$, so we can improve the conclusion in 
Proposition 4.2.2 to the following: 

(4.2.29) $F \in C^k(\Omega) \Longrightarrow x \in C^k$ jointly in $(t, y)$. 

Exercises 


1. Suppose $\tau \in \mathbb{R}$ in (4.2.27). Show that $\xi = \partial x/\partial\tau$ satisfies 

$\frac{d\xi}{dt} = D_x F(\tau, x)\xi + \frac{\partial F}{\partial\tau}(\tau, x), \quad \xi(0) = 0.$ 

2. Consider the family of differential equations for $x_\tau(t)$, 

$\frac{dx}{dt} = x + \tau x^2, \quad x(0) = 1.$ 

Write down the differential equations satisfied by $\xi = \partial x/\partial\tau$ and by $\eta = \partial^2 x/\partial\tau^2$. 


3. Let $x = x_\tau(t)$, $y = y_\tau(t)$ solve 

(4.2.30) $\frac{dx}{dt} = -y + \tau(x^2 + y^2), \quad \frac{dy}{dt} = x, \quad x(0) = 1, \quad y(0) = 0.$ 

Knowing smooth dependence on $\tau$, find differential equations for the coefficients 
$X_j(t)$, $Y_j(t)$ in power series expansions 

(4.2.31) $x_\tau(t) = X_0(t) + \tau X_1(t) + \tau^2 X_2(t) + \cdots, \qquad y_\tau(t) = Y_0(t) + \tau Y_1(t) + \tau^2 Y_2(t) + \cdots.$ 

Note that $X_0(t) = \cos t$, $Y_0(t) = \sin t$. 


4. Using the substitution $\xi(t) = -x(-t)$, $\eta(t) = y(-t)$, show that, for $\tau$ sufficiently 
small, solutions to (4.2.30) are periodic in $t$. 


5. Let $p(\tau)$ denote the period of the solution to (4.2.30). Using (4.2.31), show that 
$p(\tau)$ is smooth in $\tau$ for $|\tau|$ small. Note that $p(0) = 2\pi$. Compute $p'(0)$. Compare 
results in Appendix 4.E. 


6. Suppose $y$ in (4.2.1) is a critical point of $F$, i.e., $F(y) = 0$. Show that (4.2.7) 
becomes 

$\frac{dW}{dt} = LW, \quad W(0) = I, \quad \text{where } L = DF(y),$ 

hence 

$F(y) = 0 \Longrightarrow D_y x(t,y) = e^{tL}.$ 


4.3. Vector fields, orbits, and flows 


Let $\Omega \subset \mathbb{R}^n$ be an open set. A vector field on $\Omega$ is simply a map 

(4.3.1) $F : \Omega \longrightarrow \mathbb{R}^n,$ 

such as encountered in (4.2.1). We say $F$ is a $C^k$ vector field if $F$ is a $C^k$ map. A 
$C^\infty$ vector field is said to be smooth. By convention, if we simply call $F$ a vector 
field, we mean it is a smooth vector field. In this section we always assume $F$ is at 
least $C^1$. 

One can also look at time-dependent vector fields (cf. (4.1.1)), but in this section 
we restrict attention to the autonomous case. 

The solution to (4.2.1), i.e., to 

(4.3.2) $\frac{dx}{dt} = F(x), \quad x(0) = y,$ 

will be denoted 

(4.3.3) $x(t) = \Phi_F^t(y).$ 

Results of §§4.1-4.2 imply that for each closed bounded $K \subset \Omega$ there exists an 
interval $I = (-T_0, T_1)$ about 0 such that, for each $t \in I$, 

(4.3.4) $\Phi_F^t : K \longrightarrow \Omega,$ 

and this is a $C^k$ map if $F$ is a $C^k$ vector field. The family of maps $\Phi_F^t$ from $K$ to 
$\Omega$ is called the flow generated by $F$. We have 

(4.3.5) $\Phi_F^0(y) = y,$ 

i.e., $\Phi_F^0$ is the identity map. We also have 

(4.3.6) $\Phi_F^{s+t}(y) = \Phi_F^s \circ \Phi_F^t(y),$ 

provided all these maps are well defined. Given $y \in \Omega$, the path 

(4.3.7) $t \mapsto \Phi_F^t(y)$ 

is called the orbit through $y$. 
Another way to state the defining property of $\Phi_F^t$ is that (4.3.5) holds and 

(4.3.8) $\frac{d}{dt}\Phi_F^t(x) = F(\Phi_F^t(x)).$ 

We next obtain interesting information on the $t$-derivative of 

(4.3.9) $v^t(x) = v(\Phi_F^t(x)),$ 

given $v \in C_0^1(\Omega)$, i.e., $v$ is of class $C^1$ and vanishes outside some closed bounded 
$K \subset \Omega$. The chain rule (cf. Appendix 4.A, especially (4.A.8)) plus (4.3.8) yields 

(4.3.10) $\frac{d}{dt}v^t(x) = F(\Phi_F^t(x)) \cdot \nabla v(\Phi_F^t(x)).$ 

In particular, 

(4.3.11) $\frac{d}{dt}v(\Phi_F^t(x))\Big|_{t=0} = F(x) \cdot \nabla v(x).$ 

Here $\nabla v$ is the gradient of $v$, given by $\nabla v = (\partial v/\partial x_1, \ldots, \partial v/\partial x_n)^t$. A useful 
alternative formula to (4.3.10) is 

(4.3.12) $\frac{d}{dt}v^t(x) = \frac{d}{ds}v^t(\Phi_F^s(x))\Big|_{s=0} = F(x) \cdot \nabla v^t(x),$ 

the first equality following from (4.3.6) and the second from (4.3.11), with $v$ replaced 
by $v^t$. 
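One significant consequence of (4.3.12), which will lead to the important result 
(4.3.24) below, is that, for $v \in C_0^1(\Omega)$, 

(4.3.13) $\frac{d}{dt}\int_\Omega v(\Phi_F^t(x))\, dx = \int_\Omega F(x) \cdot \nabla v^t(x)\, dx = -\int_\Omega \mathrm{div}\, F(x)\, v(\Phi_F^t(x))\, dx.$ 

Here $\mathrm{div}\, F(x)$ is the divergence of the vector field $F(x) = (F_1(x), \ldots, F_n(x))^t$, 
defined by 

(4.3.14) $\mathrm{div}\, F(x) = \frac{\partial F_1}{\partial x_1}(x) + \cdots + \frac{\partial F_n}{\partial x_n}(x).$ 

The last equality in (4.3.13) follows by integration by parts, 

(4.3.15) $\int_\Omega F_k(x)\, \frac{\partial v^t}{\partial x_k}(x)\, dx = -\int_\Omega \frac{\partial F_k}{\partial x_k}(x)\, v^t(x)\, dx,$ 

followed by summation over $k$. We reiterate the content of (4.3.13): 

(4.3.16) $\frac{d}{dt}\int_\Omega v(\Phi_F^t(x))\, dx = -\int_\Omega \mathrm{div}\, F(x)\, v(\Phi_F^t(x))\, dx.$ 

So far, we have (4.3.16) for $v \in C_0^1(\Omega)$. We can extend this by noting that (4.3.16) 
implies 

(4.3.17) $\int_\Omega v(\Phi_F^t(x))\, dx - \int_\Omega v(x)\, dx = -\int_0^t \int_\Omega \mathrm{div}\, F(x)\, v(\Phi_F^s(x))\, dx\, ds.$ 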

Basic results on the integral allow one to pass from $v \in C_0^1(\Omega)$ in (4.3.17) to more 
general $v$, including $v = \chi_B$ (the characteristic function of $B$, defined to be equal 
to 1 on $B$ and 0 on $\Omega \setminus B$), for smoothly bounded closed $B \subset \Omega$, amongst other 
functions. 


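In more detail, if $B \subset \Omega$ is a smoothly bounded, closed set, let $B_\delta = \{x \in \mathbb{R}^n : 
\mathrm{dist}(x, B) \le \delta\}$. There exists $\delta_0 > 0$ such that $B_\delta \subset \Omega$ for $\delta \in (0, \delta_0]$. For such $\delta$, 
one can produce $v_\delta \in C_0^1(\Omega)$ such that 

(4.3.18) $v_\delta = 1$ on $B$, $\quad 0 \le v_\delta \le 1$, $\quad v_\delta = 0$ on $\mathbb{R}^n \setminus B_\delta$. 

Then 

(4.3.19) $\Bigl|\int_\Omega v_\delta(x)\, dx - \int_\Omega \chi_B(x)\, dx\Bigr| \le \mathrm{vol}(B_\delta \setminus B) \to 0, \quad \text{as } \delta \to 0,$ 

so, as $\delta \to 0$, 

(4.3.20) $\int_\Omega v_\delta(x)\, dx \longrightarrow \int_\Omega \chi_B(x)\, dx.$ 

Similar arguments give 

(4.3.21) $\int_\Omega v_\delta(\Phi_F^t(x))\, dx \longrightarrow \int_\Omega \chi_B(\Phi_F^t(x))\, dx,$ 

and 

(4.3.22) $\int_0^t \int_\Omega \mathrm{div}\, F(x)\, v_\delta(\Phi_F^s(x))\, dx\, ds \longrightarrow \int_0^t \int_\Omega \mathrm{div}\, F(x)\, \chi_B(\Phi_F^s(x))\, dx\, ds.$ 

These results allow one to take $v = \chi_B$ in (4.3.17). 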


Now one can pass from (4.3.17) back to (4.3.16), via the fundamental theorem 
of calculus. Note that 

(4.3.23) $\mathrm{Vol}\, \Phi_F^t(B) = \int_\Omega \chi_B(\Phi_F^{-t}(x))\, dx.$ 

We can apply (4.3.16) with $t$ replaced by $-t$, and $v$ by $\chi_B$, and deduce the following. 


Proposition 4.3.1. If $F$ is a $C^1$ vector field, generating the flow $\Phi_F^t$, well defined 
on $\Omega$ for $t \in I$, and $B \subset \Omega$ is smoothly bounded, then, for $t \in I$, 

(4.3.24) $\frac{d}{dt}\mathrm{Vol}\, \Phi_F^t(B) = \int_{\Phi_F^t(B)} \mathrm{div}\, F(x)\, dx.$ 


This result is behind the notation $\mathrm{div}\, F$, i.e., the divergence of $F$. Vector fields 
$F$ with positive divergence generate flows $\Phi_F^t$ that magnify volumes as $t$ increases, 
while vector fields with negative divergence generate flows that shrink volumes as 
$t$ increases. 
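
For a linear vector field $F(x) = Ax$ the divergence is the constant $\mathrm{Tr}\, A$, so the volume formula predicts that volumes are multiplied by $e^{t\,\mathrm{Tr}\, A}$. The following minimal sketch checks this in the plane, under stated assumptions: the matrix `A` is an arbitrary sample, the flow is computed by a fixed-step Runge-Kutta scheme, and the region is the unit square (chosen for convenience, since a linear flow carries it to a parallelogram whose area is a $2 \times 2$ determinant).

```python
import math

A = [[0.5, 1.0], [-1.0, -0.2]]  # sample matrix with trace 0.3


def field(x):
    """Linear vector field F(x) = A x, with div F = Tr A."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]


def rk4_step(f, x, h):
    """One classical Runge-Kutta step for the autonomous system x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def flow(x0, t, n_steps=1000):
    """Approximate the time-t flow applied to the point x0."""
    x, h = list(x0), t / n_steps
    for _ in range(n_steps):
        x = rk4_step(field, x, h)
    return x


def image_area(t):
    """Area of the image of the unit square under the time-t flow."""
    p, q = flow([1.0, 0.0], t), flow([0.0, 1.0], t)
    return abs(p[0] * q[1] - p[1] * q[0])


if __name__ == "__main__":
    trace = A[0][0] + A[1][1]
    for t in (0.5, 1.0, 2.0):
        print(t, image_area(t), math.exp(t * trace))
```

With a matrix of negative trace the same computation shows areas shrinking, in line with the remark on the sign of the divergence.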


We say the flow generated by a vector field $F$ is complete provided $\Phi_F^t(y)$ is 
defined for all $t \in \mathbb{R}$, $y \in \Omega$. We say it is forward complete if $\Phi_F^t(y)$ is defined 
for all $t \in [0, \infty)$, $y \in \Omega$. The flow is backward complete if $\Phi_F^t(y)$ is defined 
for all $t \in (-\infty, 0]$, $y \in \Omega$. Here is an occasionally useful criterion for forward 
completeness. 

Proposition 4.3.2. Let $F$ be a $C^1$ vector field on $\Omega = \mathbb{R}^n$. Assume there exists 
$R < \infty$ and a function $V \in C^1(\mathbb{R}^n)$ such that 

(4.3.25) $V(x) \to +\infty$ as $\|x\| \to \infty$ 

and 

(4.3.26) $\|x\| \ge R \Longrightarrow \nabla V(x) \cdot F(x) \le 0.$ 

Then the flow $\Phi_F^t$ is forward complete. 


Proof. Let $x(t) = \Phi_F^t(x_0)$ be an orbit, defined for $t \in I$, some interval about 0. 
Then 

(4.3.27) $\|x(t)\| \ge R \Longrightarrow \frac{d}{dt}V(x(t)) = \nabla V(x(t)) \cdot F(x(t)) \le 0.$ 

Hence, for $t \in I$, $t \ge 0$, $x(t)$ is confined to the closed bounded set 

(4.3.28) $\{x \in \mathbb{R}^n : V(x) \le \max V(y),\ y \in B_R(0) \cup \{x_0\}\}.$ 

From here, Proposition 4.1.2 yields forward completeness. 


One way to display the behavior of the flow generated by a vector field $F$ on a 
domain $\Omega$ is to draw a phase portrait. This consists of graphs of selected integral 
curves of $F$, with arrows indicating the direction of $F$ along each integral curve. 
Such portraits are particularly revealing when $\dim \Omega = 2$, and also of considerable 
use when $\dim \Omega = 3$. As an example, consider Figure 4.3.1, the phase portrait of 
the flow associated to the $2 \times 2$ system 

(4.3.29) $\frac{d\theta}{dt} = \psi, \quad \frac{d\psi}{dt} = -\frac{g}{\ell}\sin\theta,$ 

which arises from the pendulum equation (cf. Chapter 1, (1.6.9)) 

(4.3.30) $\frac{d^2\theta}{dt^2} + \frac{g}{\ell}\sin\theta = 0,$ 

by adding the variable $\psi = d\theta/dt$. Here $g, \ell > 0$. The system (4.3.29) has the form 
(4.3.2) with $x = (\theta, \psi)$ and 

(4.3.31) $F(\theta, \psi) = \begin{pmatrix} \psi \\ -\frac{g}{\ell}\sin\theta \end{pmatrix}.$ 

Note that Figure 4.3.1 looks like Figure 1.6.2 of Chapter 1, except that here we 
have added arrows, to indicate the direction of the flow. As noted in Chapter 1, 
the orbits of this flow are level curves of the function 

(4.3.32) $E(\theta, \psi) = \frac{\psi^2}{2} - \frac{g}{\ell}\cos\theta,$ 

since if $(\theta(t), \psi(t))$ solves (4.3.29), 

(4.3.33) $\frac{d}{dt}E(\theta, \psi) = \psi\psi' + \frac{g}{\ell}(\sin\theta)\theta' = 0.$ 


Figure 4.3.1. Pendulum phase plane 


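It is instructive to expand on this last calculation. In general, if $(\theta', \psi') = F(\theta, \psi)$, 

(4.3.34) $\frac{d}{dt}E(\theta, \psi) = \nabla E(\theta, \psi) \cdot (\theta', \psi')$, where $\nabla E(\theta, \psi) = \begin{pmatrix} \partial E/\partial\theta \\ \partial E/\partial\psi \end{pmatrix}.$ 

Now the formula (4.3.31) gives 

(4.3.35) $F(\theta, \psi) = -J\nabla E(\theta, \psi),$ 

where 

(4.3.36) $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$ 

so the vanishing of $dE(\theta, \psi)/dt$ follows from (4.3.34)-(4.3.35) and the skew-symmetry 
of $J$, which implies 

(4.3.37) $v \cdot Jv = 0, \quad \forall\, v \in \mathbb{R}^2.$ 

A vector field of the form (4.3.35) is a special case of a Hamiltonian vector field, a 
class of vector fields that will be discussed further in §§4.5, 4.7, and 4.10. 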


We mention some noteworthy features of the phase portrait in Figure 4.3.1, 
features to look for in other such portraits. First, there are the critical points of $F$, 
i.e., the points where $F$ vanishes. In case (4.3.31), the set of critical points is 

$\{(k\pi, 0) : k \in \mathbb{Z}\}.$ 

Figure 4.3.1 indicates different natures of the orbits near these critical points, 
depending on whether $k$ is even or odd. For $k$ even, the orbits near $(k\pi, 0)$ consist of 
closed curves. We say these critical points are centers; cf. Figure 4.3.2. 

Figure 4.3.2. Center 


Figure 4.3.3. Saddle 


For $k$ odd, the orbits near $p = (k\pi, 0)$ consist of curves of the following nature: 

(4.3.38) 
(a) two orbits that approach $p$ as $t \to +\infty$, 
(b) two orbits that approach $p$ as $t \to -\infty$, 
(c) orbits that miss $p$, looking like saddles. 

We say these critical points are saddles. See Figure 4.3.3. Sometimes one calls 
them hyperbolic critical points. 

Considerable insight is obtained from the study of the linearization of $F$ at 
each critical point. Generally, if $F$ is a $C^1$ vector field on $\Omega \subset \mathbb{R}^n$, $x_0 \in \Omega$, and 
$F(x_0) = 0$, the linearization of $F$ at $x_0$ is given by 

(4.3.39) $L = DF(x_0) \in \mathcal{L}(\mathbb{R}^n).$ 

This construction extends the notion of linearization given in §1.8 of Chapter 1. 
We expect that 

(4.3.40) $\Phi_F^t(x_0 + y) \approx x_0 + e^{tL}y,$ 

for $\|y\|$ small. See Exercise 6 of §4.2 (but mind the change in notation). Going 
further, we expect some important qualitative features of the flow $\Phi_F^t$ near $x_0$ to 
be captured by the behavior of $e^{tL}$, and this is borne out, with some exceptions. 
If $DF(x_0)$ has zero as an eigenvalue (we say $x_0$ is a degenerate critical point), this 
approximation is not typically useful. It has a better chance if $\det DF(x_0) \ne 0$. 
We then say $x_0$ is a nondegenerate critical point for $F$. 


In case $F$ is given by (4.3.31), with critical points at $p_k = (k\pi, 0)$, we have 

(4.3.41) $L_0 = DF(0,0) = \begin{pmatrix} 0 & 1 \\ -g/\ell & 0 \end{pmatrix}.$ 

The eigenvalues of this matrix are $\pm i\sqrt{g/\ell}$, and the orbits of $e^{tL_0}$ are ellipses, with 
qualitative features like Figure 4.3.2, a center. Meanwhile, 

(4.3.42) $L_1 = DF(\pm\pi, 0) = \begin{pmatrix} 0 & 1 \\ g/\ell & 0 \end{pmatrix}.$ 

The eigenvalues of this matrix are $\pm\sqrt{g/\ell}$, with corresponding eigenvectors given 
by $(1, \pm\sqrt{g/\ell})^t$, and the orbit structure for $e^{tL_1}$ has qualitative features like Figure 
4.3.3, a saddle. 
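
The eigenvalue computations for the two linearizations of the pendulum field can be reproduced with the trace and determinant of a $2 \times 2$ matrix. The sketch below is illustrative only: the numerical values of $g$ and $\ell$ are assumed samples, and the function names and the three-way labeling are hypothetical conveniences for this example, describing the linear flow $e^{tL}$ rather than the nonlinear flow.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Eigenvalues of a real 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def classify(m, tol=1e-12):
    """Label the linear flow e^{tL} as 'center', 'saddle', or 'other'."""
    lam1, lam2 = eigenvalues_2x2(m)
    if abs(lam1.real) < tol and abs(lam2.real) < tol and abs(lam1.imag) > tol:
        return "center"  # purely imaginary, nonzero eigenvalues
    if abs(lam1.imag) < tol and abs(lam2.imag) < tol and lam1.real * lam2.real < 0:
        return "saddle"  # real eigenvalues of opposite sign
    return "other"


g, ell = 9.8, 1.0  # assumed sample values
L0 = [[0.0, 1.0], [-g / ell, 0.0]]  # linearization at (0, 0)
L1 = [[0.0, 1.0], [g / ell, 0.0]]   # linearization at (+-pi, 0)

if __name__ == "__main__":
    print(classify(L0), eigenvalues_2x2(L0))
    print(classify(L1), eigenvalues_2x2(L1))
    print(math.sqrt(g / ell))
```

As the surrounding discussion makes clear, a center for the linearization does not by itself guarantee a center for the nonlinear field, so the label returned here should be read as a statement about $e^{tL}$ only.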


In general, if F is a planar vector field with a nondegenerate critical point at x_0,
and if all the eigenvalues of DF(x_0) are purely imaginary, F itself might not have
a center at x_0, i.e., the orbits of F near x_0 might not be closed orbits surrounding
x_0. Here is an example. Take


(4.3.43)   $F(x) = Jx - \|x\|^2 x, \quad x \in \mathbb{R}^2,$

with J as in (4.3.36). Then x_0 = 0 is a critical point, and DF(0) = J. Thus the
linearization has a center. However, if x(t) is an orbit for this vector field, then

(4.3.44)   $\frac{d}{dt}\|x\|^2 = 2x\cdot x' = -2\|x\|^4,$


i.e., ρ(t) = ‖x(t)‖^2 satisfies

(4.3.45)   $\frac{d\rho}{dt} = -2\rho^2.$

This is separable and we have


(4.3.46)   $\rho(0) = \rho_0 \Longrightarrow \rho(t) = \frac{\rho_0}{1 + 2t\rho_0} \searrow 0 \ \text{ as } t \nearrow +\infty,$

so the orbits of this vector field spiral into the origin as t ↗ +∞, though much
more slowly than they do in the case of spiral sinks, a type of critical point that we
will encounter shortly.
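The slow algebraic decay in (4.3.46) can be observed numerically. The sketch below is an illustration only: it integrates x' = Jx − ‖x‖²x with a classical fourth order Runge-Kutta step (a choice made here, not a method from the text) and compares ‖x(t)‖² with ρ_0/(1 + 2tρ_0). The function names and step counts are assumptions.

```python
def field(x1, x2):
    """F(x) = Jx - |x|^2 x with J = [[0, -1], [1, 0]], so Jx = (-x2, x1)."""
    r2 = x1 * x1 + x2 * x2
    return (-x2 - r2 * x1, x1 - r2 * x2)


def rk4_step(x1, x2, h):
    """One classical Runge-Kutta step of size h."""
    k1 = field(x1, x2)
    k2 = field(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
    k3 = field(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
    k4 = field(x1 + h * k3[0], x2 + h * k3[1])
    return (x1 + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            x2 + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)


def rho_numeric(x1, x2, t_final, steps):
    """Integrate from (x1, x2) up to t_final and return |x(t_final)|^2."""
    h = t_final / steps
    for _ in range(steps):
        x1, x2 = rk4_step(x1, x2, h)
    return x1 * x1 + x2 * x2


def rho_exact(rho0, t):
    """Closed form (4.3.46)."""
    return rho0 / (1.0 + 2.0 * t * rho0)


print(rho_numeric(1.0, 0.0, 10.0, 10000), rho_exact(1.0, 10.0))
```

With ρ_0 = 1 the squared radius at t = 10 is only down to 1/21, in contrast with the exponential decay one sees at a spiral sink.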


Despite the existence of such examples as (4.3.43), the fact that (0,0) is a center
for F, given by (4.3.31), is no accident, but rather a consequence of the fact that
F has the form (4.3.35),


(4.3.47)   $F(x) = -J\nabla E(x),$


so that, as derived in (4.3.34)–(4.3.37), orbits of F lie on level curves of E. Generally,
if E is a smooth real-valued function on a planar domain Ω ⊂ R^2 and the vector field
F is given by (4.3.47), (nondegenerate) critical points of F and (nondegenerate)
critical points of E coincide. If x_0 ∈ Ω is such a point,


(4.3.48)   $DF(x_0) = -J D^2E(x_0),$

where D^2E(x_0) is the matrix of second order partial derivatives of E at x_0, i.e.,

(4.3.49)   $D^2E = \begin{pmatrix} \partial^2E/\partial\theta^2 & \partial^2E/\partial\psi\,\partial\theta \\ \partial^2E/\partial\theta\,\partial\psi & \partial^2E/\partial\psi^2 \end{pmatrix}.$

We recall the following result, established in basic multivariable calculus. Let x_0
be a nondegenerate critical point of E, so D^2E(x_0) is an invertible, real symmetric
matrix. Then

(4.3.50)
    D^2E(x_0) positive definite ⟹ E has a local minimum at x_0,
    D^2E(x_0) negative definite ⟹ E has a local maximum at x_0,
    D^2E(x_0) indefinite ⟹ E has a saddle at x_0.
We also note that, whenever A ∈ M(2,R) is symmetric and invertible,

(4.3.51)
    A positive definite ⟺ det A > 0 and Tr A > 0,
    A negative definite ⟺ det A > 0 and Tr A < 0,
    A indefinite ⟺ det A < 0.


Furthermore, if A is such a matrix and


(4.3.52)   B = −JA,

then

(4.3.53)   det B = det A,


and, for such B ∈ M(2,R),

(4.3.54)
    B has 2 real eigenvalues of opposite signs ⟺ det B < 0,
    B has 2 purely imaginary eigenvalues ⟺ det B > 0 and Tr B = 0.

Putting these observations together (cf. also Exercise 8 below), we have the following.

Proposition 4.3.3. Let E be a smooth function on Ω ⊂ R^2, with a nondegenerate
critical point at x_0. Let F be given by (4.3.47). Then


(4.3.55)   DF(x_0) has 2 purely imaginary eigenvalues
           ⟺ E has a local max or local min at x_0,

and

(4.3.56)   DF(x_0) has 2 real eigenvalues of opposite sign
           ⟺ E has a saddle at x_0.


Figure 4.3.4. Damped pendulum phase plane
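The dichotomy in Proposition 4.3.3 depends only on the sign of det D²E(x_0). As a rough sketch (the function name and the sample value g/ℓ = 9.81 are assumptions, not from the text), one can classify the pendulum's critical points from the Hessian of E(θ, ψ) = ψ²/2 − (g/ℓ) cos θ:

```python
def classify_by_hessian(e11, e12, e22):
    """Classify a critical point of F = -J grad E from the symmetric
    Hessian [[e11, e12], [e12, e22]] of E, following Proposition 4.3.3."""
    det = e11 * e22 - e12 * e12
    if det == 0:
        raise ValueError("degenerate critical point: det D^2E = 0")
    if det < 0:
        return "saddle of E: DF has 2 real eigenvalues of opposite sign"
    return "local max or min of E: DF has 2 purely imaginary eigenvalues"


g_over_l = 9.81  # assumed sample value of g/l

# Hessian of E at (theta, psi) = (0, 0) is diag(g/l, 1); at (pi, 0) it is diag(-g/l, 1).
at_origin = classify_by_hessian(g_over_l, 0.0, 1.0)
at_pi = classify_by_hessian(-g_over_l, 0.0, 1.0)
print(at_origin)
print(at_pi)
```

The sign of the determinant suffices here because det(−JA) = det A and Tr(−JA) = 0 for symmetric A, as recorded in (4.3.52)–(4.3.54).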


We move on to the 2 × 2 system


(4.3.57)   $\frac{d\theta}{dt} = \psi, \qquad \frac{d\psi}{dt} = -\frac{\alpha}{m}\psi - \frac{g}{\ell}\sin\theta,$

which arises from the damped pendulum equation (cf. Chapter 1, (1.7.6)),

(4.3.58)   $\frac{d^2\theta}{dt^2} + \frac{\alpha}{m}\frac{d\theta}{dt} + \frac{g}{\ell}\sin\theta = 0,$


by adding the variable ψ = dθ/dt. Here g, ℓ, α, m > 0. The system (4.3.57) has the
form (4.3.2) with x = (θ, ψ) and


(4.3.59)   $F(\theta,\psi) = \begin{pmatrix} \psi \\ -\frac{\alpha}{m}\psi - \frac{g}{\ell}\sin\theta \end{pmatrix}.$

The phase portrait for this system is illustrated in Figure 4.3.4. We compare and
contrast this portrait with that depicted in Figure 4.3.1.


To start, the vector field (4.3.59) has the same critical points as the field given
by (4.3.31), namely {(kπ, 0) : k ∈ Z}. The first striking difference is in the behavior
near the critical points (kπ, 0) with k even. Figure 4.3.4 depicts orbits spiraling
into these critical points, as opposed to the picture in Figure 4.3.1 of closed orbits
circling these critical points. Let us consider the linearizations about these critical
points. For F as in (4.3.59), we have


(4.3.60)   $L = DF(0,0) = \begin{pmatrix} 0 & 1 \\ -g/\ell & -\alpha/m \end{pmatrix},$

with characteristic polynomial λ(λ + α/m) + g/ℓ, hence with eigenvalues

(4.3.61)   $\lambda_\pm = -\frac{\alpha}{2m} \pm \frac{1}{2}\sqrt{\frac{\alpha^2}{m^2} - \frac{4g}{\ell}}.$

Figure 4.3.5. Sinks


Figure 4.3.6. Sources


There are three cases: 


CASE I. α^2/m^2 < 4g/ℓ. Then λ_± are complex conjugates, each with real part
−α/2m.


CASE II. α^2/m^2 = 4g/ℓ. Then λ_+ = λ_− = −α/2m.

CASE III. α^2/m^2 > 4g/ℓ. Then λ_+ and λ_− are distinct real numbers, each negative.


In all three cases, we have e^{tL}v → 0 as t ↗ +∞, for each v ∈ R^2. In Case I,
there is also spiraling, and the orbits look like those in Figure 4.3.5(a). Figure 4.3.4
depicts such behavior. In Case III, the orbits look like those in Figure 4.3.5(c). In
Case II, the orbits look like a cross between Figure 4.3.5(b) and Figure 4.3.5(c).
These critical points are all called sinks. (Reverse the sign on F, and the associated
orbits are called sources; cf. Figure 4.3.6.) The three cases described above
correspond to damped oscillatory, critically damped, and overdamped motion, as
discussed in §1.9.
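The three cases can be told apart from the sign of the discriminant α²/m² − 4g/ℓ in (4.3.61). The sketch below is illustrative only; the function name and the sample parameter values are assumptions, and the exact equality test for Case II is reliable only for inputs where the floating point arithmetic happens to be exact.

```python
import cmath


def damped_pendulum_case(alpha, m, g, ell):
    """Return (case label, lambda_plus, lambda_minus) for L in (4.3.60)."""
    b = alpha / m
    disc = b * b - 4.0 * g / ell
    root = cmath.sqrt(disc)
    lam_plus = (-b + root) / 2.0
    lam_minus = (-b - root) / 2.0
    if disc < 0:
        label = "I"    # damped oscillatory: complex pair, real part -alpha/2m
    elif disc == 0:
        label = "II"   # critically damped: repeated eigenvalue -alpha/2m
    else:
        label = "III"  # overdamped: two distinct negative real eigenvalues
    return label, lam_plus, lam_minus


for alpha in (1.0, 2.0, 3.0):  # assumed sample values, with m = g = ell = 1
    print(alpha, damped_pendulum_case(alpha, 1.0, 1.0, 1.0))
```

In every case both eigenvalues have negative real part, which is the hypothesis (4.3.64) of Proposition 4.3.4.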

Further information on the nature of these orbits spiraling in toward these
sinks can be obtained from a computation of the rate of change along the orbits of
E(θ, ψ), given by (4.3.32), i.e.,


(4.3.62)   $E(\theta,\psi) = \frac{\psi^2}{2} - \frac{g}{\ell}\cos\theta.$

This time, instead of (4.3.33), we have


(4.3.63)   $\begin{aligned} \frac{d}{dt}E(\theta,\psi) &= \psi\psi' + \frac{g}{\ell}(\sin\theta)\,\theta' \\ &= \psi\Bigl(-\frac{\alpha}{m}\psi - \frac{g}{\ell}\sin\theta\Bigr) + \frac{g}{\ell}(\sin\theta)\,\psi \\ &= -\frac{\alpha}{m}\psi^2. \end{aligned}$


While this calculation applies nicely to the problem at hand, it is useful to note the 
following general phenomenon. 


Proposition 4.3.4. Let F be a smooth vector field on Ω ⊂ R^n, with a critical point
at x_0 ∈ Ω. Assume


(4.3.64)   all the eigenvalues of DF(x_0) have negative real part.

Then there exists δ > 0 such that


(4.3.65)   $\|x - x_0\| < \delta \Longrightarrow \lim_{t\to+\infty} \Phi^t_F(x) = x_0.$


To prove this, we bring in the following linear algebra result.


Lemma 4.3.5. Let L ∈ M(n,R) and assume all the eigenvalues of L have real
part < 0. Then there exists a (symmetric, positive definite) inner product ⟨ , ⟩ on
R^n and a positive constant K such that


(4.3.66)   $\langle Lv, v\rangle \le -K\langle v, v\rangle, \quad \forall\, v \in \mathbb{R}^n.$


We show how Lemma 4.3.5 allows us to prove Proposition 4.3.4. Apply this
lemma to L = DF(x_0). Note that there exist a, b ∈ (0,∞) such that


(4.3.67)   $a\|v\|^2 \le \langle v, v\rangle \le b\|v\|^2, \quad \forall\, v \in \mathbb{R}^n,$

where ⟨v, v⟩ is as in (4.3.66) and, as usual, ‖v‖^2 = v·v. Since F is smooth,

(4.3.68)   $F(x_0 + y) = Ly + R(y),$

with R smooth on a ball about 0 and DR(0) = 0. Hence

(4.3.69)   $\|R(y)\| \le C\|y\|^2 \le C'\langle y, y\rangle.$

For y(t) = Φ^t_F(x_0 + y_0) − x_0, we have

(4.3.70)   $\begin{aligned} \frac{d}{dt}\langle y(t), y(t)\rangle &= 2\langle y'(t), y(t)\rangle \\ &= 2\langle F(x_0 + y), y\rangle \\ &= 2\langle Ly, y\rangle + 2\langle R(y), y\rangle. \end{aligned}$


Now (4.3.66) applies to the first term in the last line of (4.3.70), while Cauchy's
inequality plus (4.3.69) yields

(4.3.71)   $\langle R(y), y\rangle \le \langle R(y), R(y)\rangle^{1/2}\langle y, y\rangle^{1/2} \le C\langle y, y\rangle^{3/2}.$


Hence

(4.3.72)   $\frac{d}{dt}\langle y, y\rangle \le -2K\langle y, y\rangle + C\langle y, y\rangle^{3/2} \le -K\langle y, y\rangle,$


the last inequality holding provided ⟨y, y⟩^{1/2} ≤ K/C. As long as δ in (4.3.65) is
small enough that {x ∈ R^n : ‖x − x_0‖ ≤ δ} is contained in Ω and ‖v‖ < δ ⟹
⟨v, v⟩^{1/2} ≤ K/C, if x = x_0 + y_0 and ‖y_0‖ < δ, then (4.3.72) holds for y(t) =
Φ^t_F(x_0 + y_0) − x_0 for all t ≥ 0, and yields


(4.3.73)   $\langle y(t), y(t)\rangle \le e^{-Kt}\langle y_0, y_0\rangle,$

which in turn gives (4.3.65).


We now prove Lemma 4.3.5. As shown in §2.8, C^n has a basis {v_1, ..., v_n} with
respect to which L is upper triangular, i.e.,


(4.3.74)   $Lv_j = \lambda_j v_j + \sum_{k<j} a_{jk} v_k.$

Alternatively, Appendix 2.B of Chapter 2 shows that C^n has an orthonormal basis
{v_j} for which (4.3.74) holds. The eigenvalues of L are λ_j, so by hypothesis there
exists K_1 ∈ (0,∞) such that Re λ_j ≤ −K_1 for all j. Now if we take ε > 0 and set
w_j = ε^j v_j, we get

(4.3.75)   $Lw_j = \lambda_j w_j + \sum_{k<j} \varepsilon^{j-k} a_{jk} w_k.$


Then setting


(4.3.76)   $\Bigl\langle \sum_j a_j w_j, \sum_k b_k w_k \Bigr\rangle = \operatorname{Re} \sum_j a_j \overline{b_j}$


defines a positive definite inner product (depending on ε > 0) on C^n, hence by
restriction on R^n, and if ε > 0 is taken sufficiently small, the desired conclusion
(4.3.66) follows, with K = K_1/2, from (4.3.75).


See the exercises for another proof of Lemma 4.3.5. 
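For a concrete feel for the lemma, the inner product built in Exercises 14–16 can be written as ⟨⟨v, w⟩⟩ = v·Pw, where P is the integral of (e^{tL})^t e^{tL} over [0, ∞); differentiating the integrand shows that P solves L^t P + P L = −I. The sketch below solves that linear equation for a 2 × 2 matrix by Cramer's rule. It is only an illustration: the function names are made up here, and the sample L is (4.3.60) with the assumed values g/ℓ = α/m = 1.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def lyapunov_2x2(a, b, c, d):
    """Solve L^t P + P L = -I for symmetric P = [[p, q], [q, r]],
    where L = [[a, b], [c, d]]. Returns (p, q, r)."""
    m = [[2.0 * a, 2.0 * c, 0.0],   # (1,1) entry: 2(a p + c q) = -1
         [b, a + d, c],             # (1,2) entry: b p + (a + d) q + c r = 0
         [0.0, 2.0 * b, 2.0 * d]]   # (2,2) entry: 2(b q + d r) = -1
    rhs = [-1.0, 0.0, -1.0]
    det = det3(m)
    if det == 0:
        raise ValueError("singular system; is every eigenvalue in Re < 0?")
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = rhs[i]
        solution.append(det3(mc) / det)
    return tuple(solution)


def lemma_constant(p, q, r):
    """K = 1 / (2 * largest eigenvalue of P), so <<Lv, v>> <= -K <<v, v>>."""
    tr, det = p + r, p * r - q * q
    lam_max = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
    return 1.0 / (2.0 * lam_max)


p, q, r = lyapunov_2x2(0.0, 1.0, -1.0, -1.0)  # L = [[0, 1], [-1, -1]]
print((p, q, r), lemma_constant(p, q, r))
```

Since ⟨⟨Lv, v⟩⟩ = ½ v·(L^t P + P L)v = −½‖v‖² and ⟨⟨v, v⟩⟩ is at most the largest eigenvalue of P times ‖v‖², the value returned by `lemma_constant` is one admissible K in (4.3.66) for this inner product.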


Having discussed the critical points of the vector field (4.3.59) at (0,0) and
related issues, we now consider the critical points at (±π, 0). We have


(4.3.77)   $DF(\pm\pi,0) = \begin{pmatrix} 0 & 1 \\ g/\ell & -\alpha/m \end{pmatrix}.$

This matrix has eigenvalues

(4.3.78)   $-\frac{\alpha}{2m} \pm \frac{1}{2}\sqrt{\frac{\alpha^2}{m^2} + \frac{4g}{\ell}},$


one positive and one negative. These critical points are saddles. The orbits near
these critical points have a behavior such as described in (4.3.38). Unlike the case
of F given by (4.3.31), where the orbits are level curves of E, the proof of this is
more subtle in the present situation. See Appendix 4.C for a proof.
Having studied the various critical points depicted in Figures 4.3.1 and 4.3.4,
we point out some special orbits that appear in these phase portraits, namely orbits
connecting two critical points. Generally, if F is a C^1 vector field on Ω ⊂ R^n with
critical points p_1, p_2 ∈ Ω, an orbit x(t) of Φ^t_F satisfying


(4.3.79)   $\lim_{t\to-\infty} x(t) = p_1, \qquad \lim_{t\to+\infty} x(t) = p_2$


is called a heteroclinic orbit, from p_1 to p_2, if p_1 ≠ p_2. If p_1 = p_2, such an orbit
is called a homoclinic orbit. In Figure 4.3.1, we see heteroclinic orbits connecting
p_1 = (−π, 0) and p_2 = (π, 0), one from p_1 to p_2 and one from p_2 to p_1. These lie
on level curves where E(θ, ψ) = g/ℓ.


Such a heteroclinic orbit describes the motion of a pendulum that is heading 
towards pointing vertically upward. As time goes on, the pendulum ascends more 
and more slowly, never quite reaching the vertical position. With a little less energy, 
the pendulum would stop a bit short of vertical and fall back, swinging back and 
forth. With a little more energy, the pendulum would swing past the vertical 
position. Recall that Figure 4.3.1 portrays the motion of an idealized pendulum, 
without friction. The motion of a pendulum with friction is portrayed in Figure 
4.3.4. 


In Figure 4.3.4, we see a heteroclinic orbit from (−π, 0) to (0,0), another from
(−π, 0) to (−2π, 0), another from (π, 0) to (0,0), another from (π, 0) to (2π, 0),
etc. Given that there is an orbit x(t) = (θ(t), ψ(t)) here such that lim_{t→−∞} x(t) =
(−π, 0) and ψ(t) > 0 for large negative t, the fact that lim_{t→+∞} x(t) = (0,0) can
be deduced from (4.3.63), i.e.,


(4.3.80)   $\frac{d}{dt}E(\theta,\psi) = -\frac{\alpha}{m}\psi^2.$


We end this section with a look at the phase portrait for one more vector field,
namely


(4.3.81)   $F(\theta,\psi) = \begin{pmatrix} \frac{g}{\ell}\sin\theta \\ \psi \end{pmatrix}.$

See Figure 4.3.7. In this case,

(4.3.82)   $F = \nabla E = \begin{pmatrix} \partial E/\partial\theta \\ \partial E/\partial\psi \end{pmatrix},$

with E given by (4.3.32), i.e.,

(4.3.83)   $E(\theta,\psi) = \frac{\psi^2}{2} - \frac{g}{\ell}\cos\theta.$


Such a vector field is called a gradient vector field, and its flow Φ^t_F is called a
gradient flow. Note that if F(x) = ∇E(x) and x(t) is an orbit of Φ^t_F, then

(4.3.84)   $\frac{d}{dt}E(x(t)) = \nabla E(x(t))\cdot\nabla E(x(t)) = \|\nabla E(x(t))\|^2.$

The critical points of F again consist of {(kπ, 0) : k ∈ Z}, and again they behave
differently for even k than for odd k. This time


(4.3.85)   $DF(0,0) = \begin{pmatrix} g/\ell & 0 \\ 0 & 1 \end{pmatrix},$

which is positive definite. The origin is a (non-spiraling) source; cf. Figure 4.3.6.
In particular, if x(t) = (θ(t), ψ(t)) is an orbit and x(0) is close to (0,0), then


(4.3.86)   $\lim_{t\to-\infty} x(t) = (0,0).$


This can be deduced from Proposition 4.3.4 by reversing time. It also follows
directly from (4.3.84). For k odd, we have saddles:


(4.3.87)   $DF(\pm\pi,0) = \begin{pmatrix} -g/\ell & 0 \\ 0 & 1 \end{pmatrix}.$


In this case, segments of the θ-axis (where ψ = 0) provide heteroclinic orbits, from (0,0) to
(−π, 0), from (0,0) to (π, 0), etc.


Figure 4.3.7. Gradient vector field, ∇E(θ, ψ)


Note that an orbit for F = ∇E satisfies


(4.3.88)   $\theta' = \frac{g}{\ell}\sin\theta, \qquad \psi' = \psi,$

so ψ(t) = Ae^t, and, for θ away from {kπ : k ∈ Z},

(4.3.89)   $\frac{g}{\ell}t + C = \int \frac{d\theta}{\sin\theta} = \int \frac{d\theta}{\cos(\theta - \pi/2)},$


an integral that can be evaluated via results of Exercise 14 in §1.1.
Exercises 


1. If F generates the flow Φ^t_F and v^t(x) = v(Φ^t_F(x)), show that


(4.3.90)   $D\Phi^t_F(x)\,F(x) = F(\Phi^t_F(x))$

and

(4.3.91)   $Dv^t(x) = Dv(\Phi^t_F(x))\,D\Phi^t_F(x).$


Relate these identities to the simultaneous validity of (4.3.10) and (4.3.12).


Hint. To get (4.3.90), use

(4.3.92)   $\frac{d}{dt}\Phi^t_F(x) = \frac{d}{ds}\Phi^t_F\circ\Phi^s_F(x)\Big|_{s=0} = D\Phi^t_F(x)\,F(x),$


and compare (4.3.8).


2. Extend Proposition 4.3.2 as follows. Replace hypothesis (4.3.26) by

    $\nabla V(x)\cdot F(x) \le K, \quad \forall\, x \in \mathbb{R}^n,$


for some K < ∞. Show that the flow Φ^t_F is forward complete.


3. Let Ω = R^n and assume F is a C^1 vector field on Ω. Show that if

    $\|F(x)\| \le C(1 + \|x\|),$


then the flow generated by F is complete.
(Hint. Recall Exercise 12 of §4.1.)
Show that the flow is forward complete if


    $F(x)\cdot x \le C(1 + \|x\|^2).$


4. Let Ω ⊂ R^n be open and let F be a C^1 vector field on Ω. Let U ⊂ Ω be an open
set whose closure Ū is a compact subset of Ω, and whose boundary ∂U is smooth.
Let n : ∂U → R^n denote the outward pointing unit normal to ∂U. Assume


(4.3.93)   $n(x)\cdot F(x) < 0, \quad \forall\, x \in \partial U.$


Show that Φ^t_F(x) ∈ U if x ∈ Ū and t > 0, and deduce that Φ^t_F is forward complete
on U, and also on Ū.


5. In the setting of Exercise 4, relax the hypothesis (4.3.93) to

(4.3.94)   $n(x)\cdot F(x) \le 0, \quad \forall\, x \in \partial U.$


Show that Φ^t_F(x) ∈ Ū if x ∈ Ū and t > 0, and deduce that Φ^t_F is forward complete
on Ū.

Hint. Take a C^1 vector field X on Ω such that X·n < 0 on ∂U. Set F_τ = F + τX
to produce a smooth family F_τ of C^1 vector fields on Ω such that F_0 = F and, for
0 < τ ≤ 1, F_τ has the property given in (4.3.93). Then make use of Exercise 4 and
of results of §4.2.

6. In the setting of Exercise 5, replace the hypothesis (4.3.94) by

(4.3.95)   $n(x)\cdot F(x) = 0, \quad \forall\, x \in \partial U.$

Show that Φ^t_F is complete on Ū, and that Φ^t_F(x) ∈ ∂U whenever x ∈ ∂U and t ∈ R.


7. Show that if F is given by (4.3.31), then

    $\operatorname{div} F = 0,$

while if F is given by (4.3.59), then

    $\operatorname{div} F = -\frac{\alpha}{m},$

and if F is given by (4.3.81), then

    $\operatorname{div} F = \frac{g}{\ell}\cos\theta + 1.$


8. Let A ∈ M(2,R), and take J as in (4.3.36). Show that if A is positive definite
then A = P^2 with P positive definite. Show that


    −JA and −PJP are similar,

and deduce that

    A ∈ M(2,R) positive definite ⟹ B = −JA has 2 purely imaginary eigenvalues.

Relate this to Proposition 4.3.3.


9. Consider the system

(4.3.96)   $\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = 1 - x^2.$

Take E(x,y) = y^2/2 + x^3/3 − x. Show that if (x(t), y(t)) solves (4.3.96), then
dE(x(t), y(t))/dt = 0. Show that the associated vector field has two critical points,
one a center and the other a saddle. Sketch level curves of E and put in arrows
to show the phase space portrait of F. Show that there is a homoclinic orbit
connecting the saddle to itself.


10. Find all the critical points of each of the following vector fields, and specify
whether each one is a source, sink, saddle, or center.


    $F(x,y) = \begin{pmatrix} \sin x\cos y \\ \cos x\sin y \end{pmatrix}, \qquad G(x,y) = \begin{pmatrix} \sin x\cos y \\ -\cos x\sin y \end{pmatrix}.$


11. Returning to the context of Exercise 1, show that (4.2.2) gives


(4.3.97)   $\frac{d}{dt}D\Phi^t_F(x) = DF(\Phi^t_F(x))\,D\Phi^t_F(x), \qquad D\Phi^0_F(x) = I.$

Recall from (3.8.6)–(3.8.10) of Chapter 3 that, for an n × n matrix function M(t),


    $\frac{d}{dt}M(t) = A(t)M(t) \Longrightarrow \frac{d}{dt}\det M(t) = (\operatorname{Tr} A(t))\det M(t).$

Deduce that

(4.3.98)   $\begin{aligned} \frac{d}{dt}\det D\Phi^t_F(x) &= \operatorname{Tr} DF(\Phi^t_F(x))\,\det D\Phi^t_F(x) \\ &= \operatorname{div} F(\Phi^t_F(x))\,\det D\Phi^t_F(x). \end{aligned}$


Relate this to (4.3.13), using the change of variable formula 


(4.3.99) fo dx = [ero det D&®',(x) dx. 


12. Use (3.8.10) of Chapter 3 to conclude from (4.3.98) that


(4.3.100)   $\det D\Phi^t_F(x) = \exp\Bigl\{\int_0^t \operatorname{div} F(\Phi^s_F(x))\,ds\Bigr\}.$


13. Let U ⊂ Ω ⊂ R^n be a smoothly bounded domain. The divergence theorem says
that if F is a C^1 vector field on Ω,


(4.3.101)   $\int_U \operatorname{div} F(x)\,dx = \int_{\partial U} n(x)\cdot F(x)\,dS(x),$


where n(x) is the outward pointing unit normal to ∂U and dS(x) is (n − 1)-dimensional
surface area on ∂U (arc length if n = 2). Given this identity, we
see that, in the setting of Proposition 4.3.1, (4.3.24) is equivalent to


(4.3.102)   $\frac{d}{dt}\operatorname{Vol}\Phi^t_F(B) = \int_{\partial\Phi^t_F(B)} n(x)\cdot F(x)\,dS(x).$


Show that this holds if and only if for each smoothly bounded U ⊂ Ω,


(4.3.103)   $\frac{d}{dt}\operatorname{Vol}\Phi^t_F(U)\Big|_{t=0} = \int_{\partial U} n(x)\cdot F(x)\,dS(x).$

Try to provide a direct demonstration of (4.3.103) (at least for n = 2).


Exercises 14—16 lead to another proof of Lemma 4.3.5. 


14. Take L ∈ M(n,R). Recall from §2.7 of Chapter 2 that C^n has a basis of
generalized eigenvectors for L and if v ∈ GE(L, λ), then e^{tL}v has the form


    $e^{tL}v = e^{t\lambda}\sum_{k=0}^{\ell} t^k v_k, \quad v_k \in \mathbb{C}^n.$


Use these facts to show that if Re λ < 0 for each eigenvalue λ of L, then there exist
C, K ∈ (0,∞) such that


(4.3.104)   $\|e^{tL}\| \le Ce^{-Kt}, \quad \forall\, t \ge 0.$


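15. Assume L ∈ M(n,R) and Re λ < 0 for each eigenvalue λ of L. Let ⟨v, w⟩ = v·w
denote the standard Euclidean inner product on R^n. Show that ⟨⟨v, w⟩⟩, given by


    $\langle\langle v, w\rangle\rangle = \int_0^\infty \langle e^{tL}v, e^{tL}w\rangle\,dt,$


is a well-defined, symmetric, positive-definite inner product on R^n.
Hint. Use (4.3.104) to show that the integral is absolutely convergent.


16. In the setting of Exercise 15, show that, for v ∈ R^n,

    $\langle\langle e^{sL}v, e^{sL}v\rangle\rangle = \int_s^\infty \langle e^{tL}v, e^{tL}v\rangle\,dt,$

and deduce that


    $\frac{d}{ds}\langle\langle e^{sL}v, e^{sL}v\rangle\rangle\Big|_{s=0} = -\langle v, v\rangle.$

On the other hand, show that also

    $\frac{d}{ds}\langle\langle e^{sL}v, e^{sL}v\rangle\rangle\Big|_{s=0} = \langle\langle Lv, v\rangle\rangle + \langle\langle v, Lv\rangle\rangle = 2\langle\langle Lv, v\rangle\rangle,$


and obtain another proof of Lemma 4.3.5.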


4.4. Gradient vector fields 

As mentioned in §4.3, a vector field F on an open subset Ω ⊂ R^n is a gradient
vector field provided there exists u ∈ C^1(Ω) such that

(4.4.1)   $F = \nabla u,$


i.e., F = (F_1, ..., F_n)^t with F_k = ∂u/∂x_k. It is of interest to characterize which
vector fields are gradient fields. Here is one necessary condition. Suppose u ∈ C^2(Ω)
and (4.4.1) holds. Then


(4.4.2)   $\frac{\partial F_k}{\partial x_j} = \frac{\partial}{\partial x_j}\frac{\partial u}{\partial x_k},$

and

(4.4.3)   $\frac{\partial}{\partial x_j}\frac{\partial u}{\partial x_k} = \frac{\partial}{\partial x_k}\frac{\partial u}{\partial x_j},$


so if (4.4.1) holds then


(4.4.4)   $\frac{\partial F_k}{\partial x_j} = \frac{\partial F_j}{\partial x_k}, \quad \forall\, j, k \in \{1, \dots, n\}.$


We will establish the following converse.


Proposition 4.4.1. Assume Ω ⊂ R^n is a connected open set satisfying the condition
(4.4.13) given below. Let F be a C^1 vector field on Ω. If (4.4.4) holds on Ω,
then there exists u ∈ C^2(Ω) such that (4.4.1) holds.


We will construct u as a line integral. Namely, fix p ∈ Ω, and for each x ∈ Ω
let γ be a smooth path from p to x:


(4.4.5)   $\gamma : [0,1] \to \Omega, \quad \gamma(0) = p, \quad \gamma(1) = x.$

We propose that, under the hypotheses of Proposition 4.4.1, we can take


(4.4.6)   $u(x) = \int_\gamma F(y)\cdot dy.$

Here the line integral is defined by


(4.4.7)   $\int_\gamma F(y)\cdot dy = \int_0^1 F(\gamma(t))\cdot\gamma'(t)\,dt.$


For this to work, we need to know that (4.4.6) is independent of the choice of such
a path. A key step to getting this is to consider a smooth 1-parameter family of
paths γ_s from p to x:

(4.4.8)   $\gamma_s(t) = \gamma(s,t), \quad \gamma : [0,1]\times[0,1] \to \Omega, \quad \gamma(s,0) = p, \quad \gamma(s,1) = x.$


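Lemma 4.4.2. If F is a C^1 vector field satisfying (4.4.4) and γ_s is a smooth family
satisfying (4.4.8), then


(4.4.9)   $\int_{\gamma_s} F(y)\cdot dy$ is independent of s ∈ [0,1].


Proof. We compute the s-derivative of this family of line integrals, i.e., of


(4.4.10)   $\int_0^1 F(\gamma(s,t))\cdot\frac{\partial\gamma}{\partial t}(s,t)\,dt = \sum_j \int_0^1 F_j(\gamma(s,t))\,\frac{\partial\gamma_j}{\partial t}(s,t)\,dt.$


The s-derivative of the integrand is obtained via the product rule and the chain
rule. We obtain


(4.4.11)   $\frac{d}{ds}\int_{\gamma_s} F(y)\cdot dy = \sum_{j,k}\int_0^1 \frac{\partial F_j}{\partial x_k}(\gamma(s,t))\,\frac{\partial\gamma_k}{\partial s}(s,t)\,\frac{\partial\gamma_j}{\partial t}(s,t)\,dt + \sum_j \int_0^1 F_j(\gamma(s,t))\,\frac{\partial}{\partial s}\frac{\partial\gamma_j}{\partial t}(s,t)\,dt.$


We can apply the identity


    $\frac{\partial}{\partial s}\frac{\partial\gamma_j}{\partial t} = \frac{\partial}{\partial t}\frac{\partial\gamma_j}{\partial s}$


to the second integrand on the right side of (4.4.11) and then integrate by parts.
The boundary terms vanish, since γ(s,0) = p and γ(s,1) = x give ∂γ/∂s = 0 at t = 0 and t = 1.
This involves applying ∂/∂t to F_j(γ(s,t)), and hence another application of the
chain rule. When this is done, the second integral on the right side of (4.4.11)
becomes


(4.4.12)   $-\sum_{j,k}\int_0^1 \frac{\partial F_j}{\partial x_k}(\gamma(s,t))\,\frac{\partial\gamma_k}{\partial t}(s,t)\,\frac{\partial\gamma_j}{\partial s}(s,t)\,dt.$


Now if we interchange the roles of j and k in (4.4.12), we cancel the first integral
on the right side of (4.4.11), provided (4.4.4) holds. This proves the lemma.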
Given Ω ⊂ R^n open and connected, we say Ω is simply connected provided it
has the following property:


(4.4.13)   Given p, x ∈ Ω, if γ_0 and γ_1 are smooth paths from p to x,
           they are connected by a smooth family γ_s of paths from p to x.


Here is a class of such domains.


Lemma 4.4.3. If Ω ⊂ R^n is an open convex domain, then Ω is simply connected.


Proof. If Ω is convex, two paths γ_0 and γ_1 from p ∈ Ω to x ∈ Ω are connected by

(4.4.14)   $\gamma_s(t) = (1-s)\gamma_0(t) + s\gamma_1(t), \quad 0 \le s \le 1,$

so (4.4.13) holds.


Of course there are many other simply connected domains, as the reader is 
invited to explore. 


Now that we have Lemma 4.4.2, under the hypotheses of Proposition 4.4.1 we
simply write


(4.4.15)   $u(x) = \int_p^x F(y)\cdot dy.$


Note that if q is another point in Ω, we can take a smooth path from p to x, passing
through q, and write


(4.4.16)   $u(x) = \int_p^q F(y)\cdot dy + \int_q^x F(y)\cdot dy.$


Again using the path independence, we see we can independently choose paths from 
p to q and from q to x in (4.4.16); these paths need not match up smoothly at q. 

We are now in a position to complete the proof of Proposition 4.4.1. Take
δ > 0 so that {y ∈ R^n : ‖x − y‖ ≤ δ} ⊂ Ω. Take k ∈ {1, ..., n}, fix q_k such that
|q_k − x_k| < δ, and write


(4.4.17)   $u(x) = \int_p^{(x_1,\dots,q_k,\dots,x_n)} F(y)\cdot dy + \int_{(x_1,\dots,q_k,\dots,x_n)}^{x} F(y)\cdot dy.$

Here the intermediate point is obtained by replacing x_k in x = (x_1, ..., x_n) by q_k.
The first term on the right side of (4.4.17) is independent of x_k, so

(4.4.18)   $\begin{aligned} \frac{\partial u}{\partial x_k}(x) &= \frac{\partial}{\partial x_k}\int_{(x_1,\dots,q_k,\dots,x_n)}^{x} F(y)\cdot dy \\ &= \frac{\partial}{\partial x_k}\int_{q_k}^{x_k} F_k(x_1,\dots,x_{k-1},s,x_{k+1},\dots,x_n)\,ds \\ &= F_k(x), \end{aligned}$


the last identity by the fundamental theorem of calculus. This proves Proposition
4.4.1.
An example of a domain that is not simply connected is the punctured plane
R^2 \ 0. Consider on this domain the vector field


(4.4.19)   $F(x) = \frac{Jx}{\|x\|^2}, \qquad J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$


with components


(4.4.20)   $F_1(x) = -\frac{x_2}{x_1^2 + x_2^2}, \qquad F_2(x) = \frac{x_1}{x_1^2 + x_2^2}.$

We have

(4.4.21)   $\frac{\partial F_1}{\partial x_2} = \frac{x_2^2 - x_1^2}{\|x\|^4} = \frac{\partial F_2}{\partial x_1} \quad \text{on } \mathbb{R}^2 \setminus 0.$


However, F is not a gradient vector field on R^2 \ 0. Up to an additive constant, the
only candidate for u in (4.4.1) is the angular coordinate θ,


(4.4.22)   $F(x) = \nabla\theta(x),$


and this identity is true on any region Ω formed by removing from R^2 a ray starting
from the origin. However, θ cannot be defined as a smooth, single valued function
on R^2 \ 0.
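One way to see numerically that this F cannot be a gradient on the punctured plane is to integrate it around a closed circle about the origin: for a gradient field the line integral over a closed path would vanish, while here it equals 2π, the total change in the angle θ. The sketch below is illustrative; the function names and the midpoint quadrature are choices made here, not taken from the text.

```python
import math


def F(x1, x2):
    """The field (4.4.19)-(4.4.20): F(x) = Jx / |x|^2."""
    r2 = x1 * x1 + x2 * x2
    return (-x2 / r2, x1 / r2)


def circle_integral(radius, n):
    """Midpoint-rule approximation of the line integral of F over the
    counterclockwise circle gamma(t) = radius * (cos t, sin t), 0 <= t <= 2 pi."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x1, x2 = radius * math.cos(t), radius * math.sin(t)
        dx1, dx2 = -radius * math.sin(t), radius * math.cos(t)  # gamma'(t)
        f1, f2 = F(x1, x2)
        total += (f1 * dx1 + f2 * dx2) * dt
    return total


print(circle_integral(1.0, 1000), 2.0 * math.pi)
```

The value does not depend on the radius, consistent with the variant of Lemma 4.4.2 for closed paths: circles about the origin can be deformed into one another inside R² \ 0, but not to a point.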


Let us linger on the case n = 2 and make contact with the concept of exact
equations. Consider a 2 × 2 system


(4.4.23)   $\frac{dx}{dt} = f_1(x,y), \qquad \frac{dy}{dt} = f_2(x,y).$

We take (x, y) ∈ Ω ⊂ R^2 and assume f_j ∈ C^1(Ω). This system turns into a single
differential equation for y as a function of x,


(4.4.24)   $\frac{dy}{dx} = \frac{f_2(x,y)}{f_1(x,y)},$

which we rewrite as

(4.4.25)   $g_1(x,y)\,dx + g_2(x,y)\,dy = 0, \qquad g_1(x,y) = f_2(x,y), \quad g_2(x,y) = -f_1(x,y).$

The equation (4.4.25) is called exact if there exists u ∈ C^2(Ω) such that

(4.4.26)   $g_1 = \frac{\partial u}{\partial x}, \qquad g_2 = \frac{\partial u}{\partial y}.$

If there is such a u, solutions to (4.4.24) or (4.4.25) are given by

(4.4.27)   $u(x,y) = C.$


Now (4.4.26) is the condition that G = (g_1, g_2)^t be a gradient vector field on Ω.
Note that the relation between F = (f_1, f_2)^t and G = (g_1, g_2)^t, with components
given by (4.4.25), is


(4.4.28)   $G = -JF,$


where


(4.4.29)   $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$
As we have seen, when Ω is simply connected, (4.4.26) holds for some u if and only
if

(4.4.30)   $\frac{\partial g_1}{\partial y} = \frac{\partial g_2}{\partial x}.$

Note that this is equivalent to

(4.4.31)   div F = 0.
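The exactness condition (4.4.30) is easy to probe numerically. The sketch below compares ∂g_1/∂y and ∂g_2/∂x by central differences at a few sample points. It is only a spot check under stated assumptions: the function name, the sample points, and the two test equations (2xy dx + x² dy = 0, which has u = x²y, and y dx − x dy = 0) are illustrations made up here, and agreement at finitely many points is evidence rather than proof.

```python
def looks_exact(g1, g2, points, h=1e-5, tol=1e-6):
    """Check dg1/dy == dg2/dx at the given (x, y) sample points,
    using central differences with step h."""
    for x, y in points:
        dg1_dy = (g1(x, y + h) - g1(x, y - h)) / (2.0 * h)
        dg2_dx = (g2(x + h, y) - g2(x - h, y)) / (2.0 * h)
        if abs(dg1_dy - dg2_dx) > tol:
            return False
    return True


sample_points = [(0.3, -1.2), (1.0, 2.0), (-0.7, 0.4)]  # assumed sample points

# 2xy dx + x^2 dy = 0: here dg1/dy = 2x = dg2/dx, and u(x, y) = x^2 y.
exact_case = looks_exact(lambda x, y: 2.0 * x * y, lambda x, y: x * x, sample_points)

# y dx - x dy = 0: here dg1/dy = 1 while dg2/dx = -1.
inexact_case = looks_exact(lambda x, y: y, lambda x, y: -x, sample_points)

print(exact_case, inexact_case)
```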
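REMARK. If F = (F_1, F_2, F_3)^t is a vector field on Ω ⊂ R^3, its curl is defined as

(4.4.32)   $\begin{aligned} \operatorname{curl} F = \nabla\times F &= \det\begin{pmatrix} i & j & k \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_1 & F_2 & F_3 \end{pmatrix} \\ &= \Bigl(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\Bigr)i + \Bigl(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\Bigr)j + \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)k. \end{aligned}$

We see that

(4.4.33)   (4.4.4) holds ⟺ curl F = 0.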

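We conclude with some remarks on how to construct u(x), satisfying

(4.4.34)   $\frac{\partial u}{\partial x_j}(x) = F_j(x), \quad 1 \le j \le n,$


given the compatibility conditions (4.4.4), without evaluating line integrals. We
start with


(4.4.35)   $u_n(x) = \int F_n(x)\,dx_n, \quad \text{so} \quad \frac{\partial u_n}{\partial x_n} = F_n(x).$

Then ∂(u − u_n)/∂x_n = 0, so

(4.4.36)   $u(x) = u_n(x) + v(x'), \quad x' = (x_1, \dots, x_{n-1}).$

It remains to find v, a function of fewer variables. It must solve

(4.4.37)   $\frac{\partial v}{\partial x_j}(x') = F_j(x) - \frac{\partial u_n}{\partial x_j}(x), \quad 1 \le j \le n-1.$


Note that the left side is independent of x_n, which requires that the right side have
this property. To check this, we calculate


(4.4.38)   $\begin{aligned} \frac{\partial}{\partial x_n}\Bigl(F_j(x) - \frac{\partial u_n}{\partial x_j}\Bigr) &= \frac{\partial F_j}{\partial x_n} - \frac{\partial}{\partial x_n}\frac{\partial u_n}{\partial x_j} \\ &= \frac{\partial F_n}{\partial x_j} - \frac{\partial}{\partial x_j}\frac{\partial u_n}{\partial x_n} \\ &= 0, \end{aligned}$

the second identity by (4.4.4) (and (4.4.3)). Thus (4.4.37) takes the form

(4.4.39)   $\frac{\partial v}{\partial x_j} = G_j(x'), \quad 1 \le j \le n-1,$

with G_j(x') = F_j(x) − ∂u_n/∂x_j. Note that, for 1 ≤ j, k ≤ n − 1,


(4.4.40)   $\begin{aligned} \frac{\partial G_j}{\partial x_k} &= \frac{\partial F_j}{\partial x_k} - \frac{\partial}{\partial x_k}\frac{\partial u_n}{\partial x_j} \\ &= \frac{\partial F_k}{\partial x_j} - \frac{\partial}{\partial x_j}\frac{\partial u_n}{\partial x_k} \\ &= \frac{\partial G_k}{\partial x_j}, \end{aligned}$


so the task of solving (4.4.39) is just like that in (4.4.34), but with one fewer variable.
An iteration yields the solution to (4.4.34).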


EXAMPLE. Take

(4.4.41)   $F(x,y,z) = (y,\ x + z^2,\ 2yz)^t.$


One readily verifies (4.4.4), or equivalently that curl F = 0. Here (4.4.35) gives


(4.4.42)   $u_3(x,y,z) = \int 2yz\,dz = yz^2,$

so

(4.4.43)   $u(x,y,z) = yz^2 + v(x,y).$


Next, requiring ∂u/∂y = x + z^2 means


(4.4.44)   $\frac{\partial v}{\partial y} = x,$

so

(4.4.45)   $v(x,y) = xy + w(x).$


Then, requiring ∂u/∂x = y means ∂w/∂x = 0, so we get


(4.4.46)   $u(x,y,z) = yz^2 + xy,$


as the unique function on R^3 such that ∇u = F, up to an additive constant.
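The result of this example can be checked by differentiating u numerically and comparing with F. The sketch below uses central differences; the helper name `numerical_gradient` and the sample point are assumptions made for illustration.

```python
def F(x, y, z):
    """The field (4.4.41)."""
    return (y, x + z * z, 2.0 * y * z)


def u(x, y, z):
    """The potential (4.4.46) found by the iteration."""
    return y * z * z + x * y


def numerical_gradient(fn, x, y, z, h=1e-5):
    """Central-difference approximation to the gradient of fn at (x, y, z)."""
    return ((fn(x + h, y, z) - fn(x - h, y, z)) / (2.0 * h),
            (fn(x, y + h, z) - fn(x, y - h, z)) / (2.0 * h),
            (fn(x, y, z + h) - fn(x, y, z - h)) / (2.0 * h))


point = (0.3, -1.2, 2.0)  # assumed sample point
print(numerical_gradient(u, *point))
print(F(*point))
```

Because u is at most quadratic in each variable separately, the central differences agree with the exact partial derivatives up to rounding error.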


One can turn the method given by (4.4.35)–(4.4.40) into an alternative proof
of Proposition 4.4.1, at least if Ω is an n-dimensional box. The reader is invited
to look into what happens when this method is applied to F given on R^2 \ 0 by
(4.4.19).
Exercises


For (1)–(4), identify which vector fields are gradient fields. If the field is a gradient
field ∇u, find u.

(1) (yz, xz, xy),
(2) (xy, yz, xz),
(3) (2x, z, y),
(4) (2x, y, z).


For (5)–(8), identify which equations are exact. If the equation is exact, write down
the solution, in implicit form (4.4.27).

(5) (2x + y) dx + x dy = 0,
(6) x dx + (2x + y) dy = 0,
(7) dx + x dy = 0,
(8) e^y dx + x e^y dy = 0.

Given f(x,y) dx + g(x,y) dy, a function u(x,y) is called an integrating factor if
uf dx + ug dy is exact. For example, e^y is an integrating factor for dx + x dy. Find
integrating factors for the left sides of (9)–(12), and use them to find solutions, in
implicit form.
(9) (x^2 + y^2 − 1) dx − 2xy dy = 0,
(10) xy? dx + «(1 +?) dy =0, 
(11) y dx + (2x − y e^y) dy = 0,
(12) dx + 2xy dy = 0. 


13. Establish the following variant of Lemma 4.4.2:
Lemma 4.4.2A. If F is a C^1 vector field on Ω satisfying (4.4.4) and γ_s is a smooth
family satisfying

    $\gamma_s(t) = \gamma(s,t), \quad \gamma : [0,1]\times[0,1] \to \Omega, \quad \gamma(s,0) = \gamma(s,1),$

then


    $\int_{\gamma_s} F(y)\cdot dy$ is independent of s ∈ [0,1].
In Chapter 1 we saw how Newton's law F = ma leads to a second order differential
equation for the motion on a line of a single particle, acted on by a force. Newton's
laws also apply to a system of m interacting particles, moving in n-dimensional
space, to give a second order system of the form


(4.5.1)   $m_k\frac{d^2x_k}{dt^2} = \sum_{\{j : j\ne k\}} F_{jk}(x_k - x_j), \quad 1 \le k \le m.$

Each x_k takes values in R^n, so x = (x_1, ..., x_m) takes values in R^{mn}. Here x_k is
the location of a particle of mass m_k. The law that each action produces an equal
and opposite reaction translates to


(4.5.2)   $F_{jk}(x_k - x_j) = -F_{kj}(x_j - x_k).$


A particularly important class of forces F_{jk}(x_k − x_j) are those parallel (or antiparallel)
to the line from x_j to x_k,


(4.5.3)   $F_{jk}(x_k - x_j) = f_{jk}(\|x_k - x_j\|)(x_k - x_j).$

In such a case, (4.5.2) is equivalent to

(4.5.4)   $f_{jk}(r) = f_{kj}(r).$


A force field of the form (4.5.3) is a gradient vector field,

(4.5.5)   $f_{jk}(\|u\|)u = -\nabla V_{jk}(u), \qquad V_{jk}(u) = v_{jk}(\|u\|), \quad v_{jk}'(r) = -r f_{jk}(r).$

If (4.5.4) holds,

(4.5.6)   $V_{jk}(u) = V_{kj}(u).$

The total energy of this system of interacting particles is

(4.5.7)   $E = \frac{1}{2}\sum_k m_k\Bigl\|\frac{dx_k}{dt}\Bigr\|^2 + \frac{1}{2}\sum_{j\ne k} V_{jk}(x_j - x_k).$


The first sum is the total kinetic energy and the second sum is the total potential
energy. The following calculations yield conservation of energy. First,


(4.5.8)   $\frac{dE}{dt} = \sum_k m_k\frac{d^2x_k}{dt^2}\cdot\frac{dx_k}{dt} + \frac{1}{2}\sum_{j\ne k}\nabla V_{jk}(x_j - x_k)\cdot\Bigl(\frac{dx_j}{dt} - \frac{dx_k}{dt}\Bigr).$

Next, (4.5.1) implies that the first sum on the right side of (4.5.8) is equal to

(4.5.9)   $\sum_{j\ne k} F_{jk}(x_k - x_j)\cdot\frac{dx_k}{dt},$


and (4.5.3)–(4.5.5) imply that the second sum on the right side of (4.5.8) is equal
to


(4.5.10)   $-\frac{1}{2}\sum_{j\ne k} F_{jk}(x_j - x_k)\cdot\Bigl(\frac{dx_j}{dt} - \frac{dx_k}{dt}\Bigr) = -\sum_{j\ne k} F_{jk}(x_k - x_j)\cdot\frac{dx_k}{dt}.$


Comparing (4.5.9) and (4.5.10), we have energy conservation,

(4.5.11)   $\frac{dE}{dt} = 0.$
We can convert the second order system (4.5.1) for mn variables into a first
order system for 2mn variables. One way would be to introduce the velocities
v_k = x_k', but we get a better mathematical structure by instead using the momenta,


(4.5.12)   $p_k = m_k\frac{dx_k}{dt}, \quad 1 \le k \le m.$

We can express the energy E in (4.5.7) as a function of position x = (x_1, ..., x_m)
and momentum p = (p_1, ..., p_m),

(4.5.13)   $E(x,p) = \sum_k \frac{1}{2m_k}\|p_k\|^2 + \frac{1}{2}\sum_{j\ne k} V_{jk}(x_k - x_j).$


Recall that x_k = (x_{k1}, ..., x_{kn}) ∈ R^n and p_k = (p_{k1}, ..., p_{kn}) ∈ R^n. We have


(4.5.14)   $\frac{\partial E}{\partial p_{k\ell}} = \frac{1}{m_k}p_{k\ell},$

and

(4.5.15)   $\frac{\partial E}{\partial x_{k\ell}} = \sum_{\{j : j\ne k\}}\frac{\partial V_{jk}}{\partial x_{k\ell}}(x_k - x_j),$

invoking (4.5.6). Let us write (4.5.14)–(4.5.15) in vector form,

(4.5.16)   $\frac{\partial E}{\partial p_k} = \frac{1}{m_k}p_k, \qquad \frac{\partial E}{\partial x_k} = \sum_{\{j : j\ne k\}}\nabla V_{jk}(x_k - x_j),$

where ∂E/∂p_k = (∂E/∂p_{k1}, ..., ∂E/∂p_{kn})^t, etc. Now the system (4.5.1) yields the
first order system


(4.5.17)   $\frac{dx_k}{dt} = \frac{1}{m_k}p_k, \qquad \frac{dp_k}{dt} = \sum_{\{j : j\ne k\}} F_{jk}(x_k - x_j),$


which in turn, given (4.5.3)–(4.5.5), gives


(4.5.18)   $\frac{dx_k}{dt} = \frac{\partial E}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial E}{\partial x_k}.$


The system (4.5.18) is said to be in Hamiltonian form.
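To see the Hamiltonian form at work, here is a small numerical sketch for two particles on a line (n = 1, m = 2) with a linear restoring force F_jk(u) = −κu, so that f_jk = −κ and V_jk(u) = κu²/2. Everything specific here is an assumption made for illustration: the spring constant, masses, initial data, and the velocity Verlet (leapfrog) time stepping, which is a standard scheme for such systems and not a method from the text.

```python
def pair_force(u, kappa):
    """F_jk(u) = f(|u|) u with f = -kappa, i.e. V_jk(u) = kappa * u^2 / 2."""
    return -kappa * u


def energy(x1, x2, p1, p2, m1, m2, kappa):
    """E(x, p) from (4.5.13); the half-sum over j != k counts the pair once."""
    return p1 * p1 / (2.0 * m1) + p2 * p2 / (2.0 * m2) + 0.5 * kappa * (x1 - x2) ** 2


def simulate(x1, x2, p1, p2, m1, m2, kappa, h, steps):
    """Integrate (4.5.17) with velocity Verlet steps of size h."""
    for _ in range(steps):
        f1 = pair_force(x1 - x2, kappa)   # force on particle 1: F_21(x1 - x2)
        p1 += 0.5 * h * f1
        p2 -= 0.5 * h * f1                # F_12(x2 - x1) = -F_21(x1 - x2), by (4.5.2)
        x1 += h * p1 / m1                 # dx_k/dt = p_k / m_k
        x2 += h * p2 / m2
        f1 = pair_force(x1 - x2, kappa)
        p1 += 0.5 * h * f1
        p2 -= 0.5 * h * f1
    return x1, x2, p1, p2


m1, m2, kappa = 1.0, 2.0, 4.0             # assumed sample parameters
start = (0.0, 1.5, 0.3, -0.1)             # assumed initial (x1, x2, p1, p2)
end = simulate(*start, m1, m2, kappa, 1e-3, 10000)
E0 = energy(*start, m1, m2, kappa)
E1 = energy(*end, m1, m2, kappa)
print(E0, E1, start[2] + start[3], end[2] + end[3])
```

In exact arithmetic the energy is conserved along solutions by (4.5.11); the discretization preserves it only approximately, while the total momentum p_1 + p_2 is preserved by this scheme up to rounding, because the two momentum updates cancel.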

We can place the study of Hamiltonian equations in a more general framework,
as follows. Let R^{2K} have points (x, p), x = (x_1, ..., x_K), p = (p_1, ..., p_K). Let
Ω ⊂ R^{2K} be open and assume E ∈ C^1(Ω). Consider the system


(4.5.19)   $\frac{dx_k}{dt} = \frac{\partial E}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial E}{\partial x_k},$


for 1 ≤ k ≤ K. This is called a Hamiltonian system. It is of the form


(4.5.20)   $\frac{d}{dt}\begin{pmatrix} x \\ p \end{pmatrix} = X_E(x,p),$

where X_E is a vector field on Ω, called the Hamiltonian vector field associated to E.
In this general setting, E is constant on each solution curve (x(t), p(t)) of (4.5.19).
Indeed, in such a case,


(4.5.21)   $\begin{aligned} \frac{d}{dt}E(x,p) &= \sum_k \frac{\partial E}{\partial x_k}\frac{dx_k}{dt} + \sum_k \frac{\partial E}{\partial p_k}\frac{dp_k}{dt} \\ &= \sum_k \frac{\partial E}{\partial x_k}\frac{\partial E}{\partial p_k} - \sum_k \frac{\partial E}{\partial p_k}\frac{\partial E}{\partial x_k} = 0. \end{aligned}$


Returning to the setting (4.5.1)–(4.5.2), we next discuss the conservation of the
total momentum,


(4.5.22)   $P = \sum_k p_k = \sum_k m_k\frac{dx_k}{dt}.$

Indeed,

(4.5.23)   $\frac{dP}{dt} = \sum_k m_k\frac{d^2x_k}{dt^2} = \sum_{j\ne k} F_{jk}(x_k - x_j) = 0,$


the last identity by (4.5.2). Thus, for each solution x(t) to (4.5.1), there exist
a, b ∈ R^n such that


(4.5.24)   $\frac{1}{M}\sum_k m_k x_k(t) = a + bt, \qquad M = \sum_k m_k.$

The left side is the center of mass of the system of interacting particles. The vectors
a, b ∈ R^n are given by the initial data for (4.5.1),


(4.5.25)   $a = \frac{1}{M}\sum_k m_k x_k(0), \qquad b = \frac{1}{M}\sum_k m_k x_k'(0).$

Given this, we can obtain a system similar to (4.5.1) for the variables

(4.5.26)   $y_k(t) = x_k(t) - (a + bt).$

We have y_k'' = x_k'' and y_k − y_j = x_k − x_j, so (4.5.1) gives

(4.5.27)   $m_k\frac{d^2y_k}{dt^2} = \sum_{\{j : j\ne k\}} F_{jk}(y_k - y_j), \quad 1 \le k \le m.$

In this case we have the identity

(4.5.28)   $\sum_k m_k y_k(t) = 0,$


as a consequence of (4.5.24). We can use this to reduce the size of (4.5.27), from a
system of mn equations to a system of (m − 1)n equations, by substituting

(4.5.29)   $y_m = -\frac{1}{m_m}\sum_{k=1}^{m-1} m_k y_k$

into (4.5.27), for 1 ≤ k ≤ m − 1. One calls (y_1, ..., y_m) center of mass coordinates.
In case m = 2, this substitution works out quite nicely. We have


(4.5.30)   $y_2 = -\frac{m_1}{m_2}y_1,$

and the system (4.5.27) reduces to

(4.5.31)   $m_1\frac{d^2y_1}{dt^2} = F_{21}\Bigl(\Bigl(1 + \frac{m_1}{m_2}\Bigr)y_1\Bigr),$


the equation of motion of a single particle in an external force field. Alternatively,
for x = x_1 − x_2 = y_1 − y_2 = (1 + m_1/m_2)y_1,


(4.5.32)   $\frac{m_1m_2}{m_1 + m_2}\frac{d^2x}{dt^2} = F_{21}(x).$


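For m > 2, the resulting equations are not so neat. For example, for m = 3,
we have


(4.5.33)   $y_3 = -\frac{m_1}{m_3}y_1 - \frac{m_2}{m_3}y_2,$

and the system (4.5.27) reduces to


(4.5.34)   $\begin{aligned} m_1y_1'' &= F_{21}(y_1 - y_2) + F_{31}\Bigl(\Bigl(1 + \frac{m_1}{m_3}\Bigr)y_1 + \frac{m_2}{m_3}y_2\Bigr), \\ m_2y_2'' &= F_{12}(y_2 - y_1) + F_{32}\Bigl(\frac{m_1}{m_3}y_1 + \Bigl(1 + \frac{m_2}{m_3}\Bigr)y_2\Bigr). \end{aligned}$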

Exercises 


1. In (a)–(e), take n = 1, m = 3, and m_1 = m_2 = m_3 = 1. Set up the equations of
motion in center of mass coordinates and analyze the solution.
(a) F_{jk}(x) = x.
(b) F_{jk}(x) = −x.
(c) F_{12}(x) = F_{13}(x) = x, F_{23}(x) = −x.
(d) F_{12}(x) = F_{13}(x) = −x, F_{23}(x) = x.
) Fio(@) = Fo3(@) = —a,  Fro3(x) = 1. 




In all cases, (4.5.2) must be enforced. 


2. For an alternative derivation of (4.5.32), when m = 2, write (4.5.1) as


    $\frac{d^2x_1}{dt^2} = \frac{1}{m_1}F_{21}(x_1 - x_2), \qquad \frac{d^2x_2}{dt^2} = \frac{1}{m_2}F_{12}(x_2 - x_1),$


and subtract, using (4.5.2).
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4.6. Central force problems and two-body planetary motion 


As seen in §4.5, one can transform the m-body problem (4.5.1) to center of mass 
coordinates, under the hypothesis (4.5.2), and obtain a smaller system, which for 
m = 2 is given by (4.5.32). Changing notation, we rewrite (4.5.32) as 


$$m \frac{d^2 x}{dt^2} = F(x). \tag{4.6.1}$$

Here $x \in \mathbb{R}^n$. We assume $F \in C^1(\mathbb{R}^n \setminus 0)$ but allow blowup at $x = 0$. Under
hypotheses (4.5.3)–(4.5.4) for the two body problem, we have

$$F(x) = f(\|x\|)\, x. \tag{4.6.2}$$

In such a case, (4.6.1) is called a central force problem. Parallel to (4.5.5), we have

$$F(x) = -\nabla V(x), \qquad V(x) = v(\|x\|), \qquad v'(r) = -r f(r). \tag{4.6.3}$$

The total energy is given by

$$E = \frac{1}{2}\, m \Bigl\| \frac{dx}{dt} \Bigr\|^2 + V(x), \tag{4.6.4}$$

and if $x(t)$ solves (4.6.1), then

$$\frac{dE}{dt} = m \frac{d^2 x}{dt^2} \cdot \frac{dx}{dt} + \nabla V(x) \cdot \frac{dx}{dt} = 0, \tag{4.6.5}$$

yielding conservation of energy.
There are further conservation laws, starting with the following. 


Proposition 4.6.1. Assume $x(0) \neq 0$, and let $W \subset \mathbb{R}^n$ be the linear span of $x(0)$
and $x'(0)$. If $x(t)$ solves (4.6.1) for $t \in I$ and (4.6.2) holds, we have

$$x(t) \in W, \qquad \forall\, t \in I. \tag{4.6.6}$$


Proof. One way to see this is to note that (4.6.1) is a well posed system for $x(t)$
taking values in $W$. Then uniqueness of solutions yields (4.6.6). Here is another
demonstration.

Define $A \in \mathcal{L}(\mathbb{R}^n)$ by

$$Av = v, \quad \forall\, v \in W, \qquad Av = -v, \quad \forall\, v \in W^{\perp}. \tag{4.6.7}$$

Note that $A$ is an orthogonal transformation. Let $y(t) = Ax(t)$. The hypothesis on
the initial data gives

$$y(0) = x(0), \qquad y'(0) = x'(0). \tag{4.6.8}$$

Also, given $F(x)$ of the form (4.6.2), we have $AF(x) = F(y)$, since $\|Ax\| = \|x\|$, so $y(t)$ solves (4.6.1).
The basic uniqueness result proven in §4.1 implies $y \equiv x$ on $I$, which in turn gives
(4.6.6).


A third proof of Proposition 4.6.1, valid for $n = 3$, can be obtained from
conservation of angular momentum, established in (4.6.11) below.

Proposition 4.6.1 guarantees that each path $x(t)$ solving (4.6.1) lies in a plane,
and we can take $n = 2$. For the next step, it is actually convenient to take $n = 3$.
Thus $x(t)$ solves (4.6.1) and $x(t)$ is a path in $\mathbb{R}^3$. We define the angular momentum

$$\alpha(t) = m\, x(t) \times x'(t). \tag{4.6.9}$$

We then have, under hypothesis (4.6.2),

$$\begin{aligned}
\alpha'(t) &= m\, x(t) \times x''(t) \\
&= x(t) \times F(x) \\
&= f(\|x\|)\, x(t) \times x(t) \\
&= 0.
\end{aligned} \tag{4.6.10}$$

This yields conservation of angular momentum,

$$x(t) \times x'(t) = L, \tag{4.6.11}$$

where $L = x(0) \times x'(0) \in \mathbb{R}^3$. In case $x(t) = (x_1(t), x_2(t), 0)$, we have

$$x(t) \times x'(t) = \bigl(0, 0, x_1(t) x_2'(t) - x_1'(t) x_2(t)\bigr), \tag{4.6.12}$$

so the conservation law (4.6.11) gives

$$x_1(t) x_2'(t) - x_1'(t) x_2(t) = L_3. \tag{4.6.13}$$


Let's return to the planar setting, and also use complex notation,

$$x(t) = x_1(t) + i x_2(t) = r(t)\, e^{i\theta(t)}. \tag{4.6.14}$$

A computation gives

$$\begin{aligned}
x' &= (r' + i r \theta')\, e^{i\theta}, \\
x'' &= \bigl[r'' - r(\theta')^2 + i(2 r' \theta' + r \theta'')\bigr]\, e^{i\theta},
\end{aligned} \tag{4.6.15}$$

so (4.6.1)–(4.6.2) becomes

$$m \bigl[r'' - r(\theta')^2 + i(2 r' \theta' + r \theta'')\bigr] = f(r)\, r. \tag{4.6.16}$$

Equating real and imaginary parts separately, we get

$$\begin{aligned}
r'' - r(\theta')^2 &= \frac{f(r)\, r}{m}, \\
2 r' \theta' + r \theta'' &= 0.
\end{aligned} \tag{4.6.17}$$

Note that

$$\frac{d}{dt}\bigl(r^2 \theta'\bigr) = r\,(2 r' \theta' + r \theta''), \tag{4.6.18}$$

so the second equation in (4.6.17) says $r^2 \theta'$ is independent of $t$. This is actually
equivalent to the conservation of angular momentum, (4.6.13). In fact, we have
$x_1 = r\cos\theta$, $x_2 = r\sin\theta$, hence

$$x_1' = r'\cos\theta - r\theta'\sin\theta, \qquad x_2' = r'\sin\theta + r\theta'\cos\theta, \tag{4.6.19}$$

and hence

$$x_1 x_2' - x_1' x_2 = r^2 \theta'. \tag{4.6.20}$$

Thus we have in two ways derived the identity

$$r^2 \theta' = L. \tag{4.6.21}$$

(For notational simplicity, we drop the subscript 3 from (4.6.13).)


There is the following geometrical interpretation of (4.6.21). The (signed) area
$A(t)$ swept out by the ray from $0$ to $x(s)$, as $s$ runs from $t_0$ to $t$, is given by

$$A(t) = \frac{1}{2} \int_{\theta(t_0)}^{\theta(t)} r^2\, d\theta = \frac{1}{2} \int_{t_0}^{t} r^2 \theta'(s)\, ds, \tag{4.6.22}$$

so

$$A'(t) = \frac{1}{2}\, r^2 \theta' = \frac{L}{2}. \tag{4.6.23}$$
This says 
(4.6.24) Equal areas are swept out in equal times, 


which, as we will discuss below, is Kepler’s second law. 
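As a numerical illustration of (4.6.13) and (4.6.21), the following sketch integrates the planar central force system with a classical Runge-Kutta scheme and tracks how far $x_1 x_2' - x_1' x_2$ drifts from its initial value. The two force laws, the initial data, the mass $m = 1$, and the step size are arbitrary choices made for this example; they are not taken from the text.

```python
import math

def rhs(state, f, m):
    """Right side of m x'' = f(|x|) x, written as a first order system."""
    x1, x2, v1, v2 = state
    c = f(math.hypot(x1, x2)) / m
    return (v1, v2, c * x1, c * x2)

def rk4_step(state, h, f, m):
    """One classical Runge-Kutta step of size h."""
    def shift(k, a):
        return tuple(s + a * ki for s, ki in zip(state, k))
    k1 = rhs(state, f, m)
    k2 = rhs(shift(k1, h / 2), f, m)
    k3 = rhs(shift(k2, h / 2), f, m)
    k4 = rhs(shift(k3, h), f, m)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def areal_quantity(state):
    """x1 x2' - x1' x2, the left side of (4.6.13)."""
    x1, x2, v1, v2 = state
    return x1 * v2 - v1 * x2

def max_drift(f, state, m=1.0, h=1e-3, steps=5000):
    """Largest deviation of x1 x2' - x1' x2 from its initial value."""
    start_value = areal_quantity(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h, f, m)
        worst = max(worst, abs(areal_quantity(state) - start_value))
    return worst

start = (1.0, 0.0, 0.3, 0.8)                        # illustrative initial data
drift_inverse_square = max_drift(lambda r: -1.0 / r**3, start)
drift_other = max_drift(lambda r: -r**2, start)     # a non-Kepler central force
print(drift_inverse_square, drift_other)
```

Both drifts should be at the level of the integrator's truncation error, consistent with the fact that (4.6.21) holds for every $f$ in (4.6.2), not only for the inverse square law.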

Next, we can plug $\theta' = L/r^2$ into the first equation of (4.6.17), obtaining

$$\frac{d^2 r}{dt^2} = \frac{f(r)\, r}{m} + \frac{L^2}{r^3}. \tag{4.6.25}$$

This has the form

$$\frac{d^2 r}{dt^2} = g(r), \tag{4.6.26}$$

treated in §1.5. We recall that treatment. Take $w(r)$ such that $g(r) = -w'(r)$, so
(4.6.26) becomes

$$\frac{d^2 r}{dt^2} = -w'(r). \tag{4.6.27}$$

Then form the “energy”

$$E = \frac{1}{2} \Bigl(\frac{dr}{dt}\Bigr)^2 + w(r), \tag{4.6.28}$$

and compute that if $r(t)$ solves (4.6.27), then

$$\frac{dE}{dt} = \frac{d^2 r}{dt^2}\, \frac{dr}{dt} + w'(r)\, \frac{dr}{dt} = 0, \tag{4.6.29}$$

so for each solution to (4.6.27), there is a constant $E$ such that

$$\frac{dr}{dt} = \pm \sqrt{2E - 2w(r)}. \tag{4.6.30}$$

Separation of variables gives

$$\int \frac{dr}{\sqrt{2E - 2w(r)}} = \pm t + C. \tag{4.6.31}$$

This integral can be quite messy.

Note that dividing (4.6.30) by (4.6.21) yields a differential equation for $r$ as a
function of $\theta$,

$$\frac{dr}{d\theta} = \pm \frac{r^2}{L} \sqrt{2E - 2w(r)}, \tag{4.6.32}$$

which separates to

$$L \int \frac{dr}{r^2 \sqrt{2E - 2w(r)}} = \pm\theta + C. \tag{4.6.33}$$

Let us recall that

$$w'(r) = -\frac{f(r)\, r}{m} - \frac{L^2}{r^3}. \tag{4.6.34}$$

Typically the integral in (4.6.33) is as messy as the one in (4.6.31). These integrals
do turn out to be tractable in one very important case, the Kepler problem, to
which we now turn.


This problem is named after the astronomer Johannes Kepler, who from obser- 
vations formulated the following three laws for planetary motion. 


1. The planets move on ellipses with the sun at one focus. 


2. The line segment from the sun to a planet sweeps out equal areas in equal time 
intervals. 


3. The period of revolution of a planet is proportional to $a^{3/2}$, where $a$ is the
semimajor axis of its ellipse.


The Kepler problem is to provide a theoretical framework in which to derive these 
three laws. This was solved by Isaac Newton, who formulated his universal law of 
gravitation, used it to derive a differential equation for the position of a planet, and 
solved the differential equation. 


Newton's law of gravitation specifies the force between two objects, of mass $m_1$
and $m_2$, located at points $x_1$ and $x_2$ in $\mathbb{R}^3$. Let us say the center of the planet is
at $x_1$ and the center of the sun is at $x_2$. In the framework of (4.5.1), this means
specifying the vector field $F_{21}$ on $\mathbb{R}^3$. The formula is

$$F_{21}(x) = -G m_1 m_2\, \frac{x}{\|x\|^3}. \tag{4.6.35}$$

Here $G$ is the universal gravitational constant. If we go to center of mass coordinates,
the motion of the planet is governed by (4.5.32), yielding (4.6.1) with

$$F(x) = -K m\, \frac{x}{\|x\|^3}, \qquad K = G(m_1 + m_2). \tag{4.6.36}$$

Here $m = m_1$ is the mass of the planet and $m_2$ is the mass of the sun. Consequently
we have (4.6.17) with

$$f(r) = -\frac{K m}{r^3}, \tag{4.6.37}$$

and (4.6.25) becomes

$$\frac{d^2 r}{dt^2} = -\frac{K}{r^2} + \frac{L^2}{r^3}. \tag{4.6.38}$$

Thus $w(r)$ in (4.6.27)–(4.6.34) is given by

$$w(r) = -\frac{K}{r} + \frac{L^2}{2 r^2}. \tag{4.6.39}$$

Thus the integral in (4.6.31) is

$$\int \frac{r\, dr}{\sqrt{2E r^2 + 2K r - L^2}}, \tag{4.6.40}$$

and the integral in (4.6.33) is

$$L \int \frac{dr}{r \sqrt{2E r^2 + 2K r - L^2}}. \tag{4.6.41}$$

The integral (4.6.40) can be evaluated by completing the square for $2E r^2 + 2K r - L^2$.
The integral (4.6.41) can also be evaluated, but rather than tackling this directly,
we instead produce a differential equation for $u$, defined by

$$u = \frac{1}{r}. \tag{4.6.42}$$

By the chain rule,

$$\frac{dr}{dt} = -r^2 \frac{du}{dt} = -r^2 \frac{du}{d\theta}\, \frac{d\theta}{dt} = -L \frac{du}{d\theta}, \tag{4.6.43}$$

the last identity by (4.6.21). Taking another $t$-derivative gives

$$\frac{d^2 r}{dt^2} = -L \frac{d}{dt} \frac{du}{d\theta} = -L \frac{d^2 u}{d\theta^2}\, \frac{d\theta}{dt} = -L^2 u^2 \frac{d^2 u}{d\theta^2}, \tag{4.6.44}$$

again using (4.6.21). Comparing this with (4.6.38), we get

$$-L^2 u^2 \frac{d^2 u}{d\theta^2} = -K u^2 + L^2 u^3, \tag{4.6.45}$$

or equivalently

$$\frac{d^2 u}{d\theta^2} + u = \frac{K}{L^2}. \tag{4.6.46}$$

Miraculously, we have obtained a linear equation! The general solution to (4.6.46)
is

$$u(\theta) = A \cos(\theta - \theta_0) + \frac{K}{L^2}, \tag{4.6.47}$$

which by (4.6.42) gives

$$r \Bigl[ A \cos(\theta - \theta_0) + \frac{K}{L^2} \Bigr] = 1. \tag{4.6.48}$$

This is equivalent to

$$r \bigl[ 1 + e \cos(\theta - \theta_0) \bigr] = p, \qquad p = \frac{L^2}{K}, \qquad e = \frac{A L^2}{K}. \tag{4.6.49}$$


If e = 0, this is the equation of a circle. If 0 < e < 1, it is the equation of an 
ellipse. If e = 1, it is the equation of a parabola, and if e > 1, it is the equation 
of one branch of a hyperbola. Among these curves, those that are bounded are the 
ellipses, and the circle, which we regard as a special case of an ellipse. 
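The classification above can be sketched in code. The function below takes planar initial data and the constant $K$, computes $L$ from (4.6.13) and $E$ from (4.6.28) with $w$ as in (4.6.39), and returns $p$, $e$, and the type of conic. It uses the relation $e^2 = 1 + 2EL^2/K^2$, which is not stated in the text; it follows from substituting (4.6.47) into (4.6.28) via (4.6.43). The function name, the tolerance, and the sample data are assumptions made for this illustration, and $L \neq 0$ is assumed.

```python
import math

def orbit_shape(x, v, K, tol=1e-9):
    """Return (p, e, kind) for planar initial position x and velocity v."""
    r = math.hypot(x[0], x[1])
    L = x[0] * v[1] - v[0] * x[1]                # as in (4.6.13)
    E = 0.5 * (v[0] ** 2 + v[1] ** 2) - K / r    # (4.6.28), using (4.6.39)
    p = L * L / K                                # as in (4.6.49)
    # e^2 = 1 + 2 E L^2 / K^2 (derived relation; clamp tiny negative rounding).
    e = math.sqrt(max(0.0, 1.0 + 2.0 * E * L * L / (K * K)))
    if e < tol:
        kind = "circle"
    elif e < 1.0 - tol:
        kind = "ellipse"
    elif e <= 1.0 + tol:
        kind = "parabola"
    else:
        kind = "hyperbola"
    return p, e, kind

# Start at distance 1 with velocity perpendicular to the position vector.
for speed in (1.0, 1.2, math.sqrt(2.0), 2.0):
    print(speed, orbit_shape((1.0, 0.0), (0.0, speed), K=1.0))
```

With $K = 1$ and initial distance 1, speed 1 gives the circular orbit, and speed $\sqrt{2}$ is the threshold between bounded and unbounded orbits.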

Since planets move in bounded orbits, this establishes Kepler’s first law (with 
some caveats, which we discuss below). Kepler’s second law holds for general central 
force problems, as noted already in (4.6.24). To establish the third law, recall from
(4.6.23) that $L/2$ is the rate at which such area is swept out, so the period $T$ of the
orbit satisfies

$$\frac{L}{2}\, T = \text{area enclosed by the ellipse} = \pi a b, \tag{4.6.50}$$

where $a$ is the semimajor axis and $b$ the semiminor axis. For an ellipse given by
(4.6.49), we have

$$a = \frac{p}{1 - e^2}, \qquad b = \frac{p}{\sqrt{1 - e^2}} = \sqrt{p a} \tag{4.6.51}$$

(cf. (4.6.58)–(4.6.59) below), which yields

$$T = \frac{2\pi a b}{L} = \frac{2\pi\, a^{3/2} \sqrt{p}}{L} = \frac{2\pi}{\sqrt{K}}\, a^{3/2}. \tag{4.6.52}$$

This establishes Kepler's third law.
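A short sketch of how (4.6.52) is used in practice: if two planets orbit the same sun and both planetary masses are neglected in $K = G(m_1 + m_2)$, then $K$ cancels from the ratio of their periods. The helper names are made up for this example, and the semimajor axis of Mars (about 1.524 times that of the Earth) is an outside data point, not a value from the text.

```python
import math

def period_from_axis(a, K):
    """Period T from (4.6.52), for semimajor axis a and K as in (4.6.36)."""
    return 2.0 * math.pi * a ** 1.5 / math.sqrt(K)

def period_ratio(a_first, a_second):
    """T_first / T_second when both orbits share the same K."""
    return (a_first / a_second) ** 1.5

# Units of years and Earth-orbit radii: T = 1 at a = 1 forces K = 4 pi^2.
K_sun = 4.0 * math.pi ** 2
earth_years = period_from_axis(1.0, K_sun)
mars_years = period_from_axis(1.524, K_sun)
print(earth_years, mars_years, period_ratio(1.524, 1.0))
```

The computed period for Mars comes out near 1.88 years, in line with the observed value.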

We now discuss some caveats. Our solar system has nine planets, plus numerous 
other satellites. In the calculations above, all but one planet was ignored. One can 
expect this approximation to work best for Jupiter. Jupiter has about $10^{-3}$ the
sun’s mass, and its distance from the sun is about 400 times the sun’s radius. 
Hence the center of mass of Jupiter and the sun is located about 0.4 times the sun’s 
radius from the center of the sun. The sun and Jupiter engage in a close to circular 
elliptical orbit with a focus at this center of mass. Clearly, this motion is going 
to influence the orbits of the other planets. In fact, each planet influences all the 
others, including Jupiter, in ways not captured by the calculations of this section. 
Realization of this situation led to a vigorous development of the subject known as 
celestial mechanics, from Newton’s time on. Material on this can be found in [1] 
and [15], and references given there. 


Advances in celestial mechanics led to the discovery of the planet Neptune. By 
the early 1900s, this subject was sufficiently well developed that astronomers were 
certain that an observed anomaly in the motion of Mercury could not be explained 
by the Newtonian theory. This discrepancy was accounted for by Einstein’s theory 
of general relativity, which provided a new foundation for the theory of gravity. 
This is discussed in [3] and also in Chapter 18 of [45]. While a derivation is well 
outside the scope of this book, we mention that the relativistic treatment leads to 
the following variant of (4.6.46),

$$\frac{d^2 u}{d\theta^2} + u = A + \varepsilon u^2, \tag{4.6.53}$$

where $A \approx K/L^2$ and $\varepsilon$ is a certain (small) positive constant, determined by the
mass of the sun. This can be converted into the first order system

$$\frac{du}{d\theta} = v, \qquad \frac{dv}{d\theta} = -u + A + \varepsilon u^2. \tag{4.6.54}$$

In analogy with (4.6.26)–(4.6.29), we can form

$$F(u, v) = \frac{1}{2} v^2 + \frac{1}{2} u^2 - A u - \frac{\varepsilon}{3} u^3, \tag{4.6.55}$$

and check that if $(u(\theta), v(\theta))$ solves (4.6.54), then

$$\frac{d}{d\theta} F(u, v) = 0, \tag{4.6.56}$$

so the orbits for (4.6.54) lie on level curves of $F$. As long as $A\varepsilon \in (0, 1/4)$, $F$ has
two critical points, a minimum and a saddle. Thus (4.6.54) has some solutions
periodic in $\theta$. However, the period is generally not equal to $2\pi$. (See Appendix
4.E for results related to computing this period.) This fact leads to the precession
of the perihelion of the planet orbiting the sun, where the perihelion is the place
where $u$ is maximal, so $r$ is minimal. In the nonrelativistic situation covered by
(4.6.46), all the solutions in (4.6.47) are periodic in $\theta$ of period $2\pi$.
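The following sketch integrates the system (4.6.54) numerically, monitors $F$ from (4.6.55) along the orbit, and measures the $\theta$-interval between two successive maxima of $u$ (successive perihelia). The values $A = 1$, $\varepsilon = 0.01$, and $u(0) = 1.2$ are purely illustrative and far larger than anything physically relevant; they are chosen so that the deviation from $2\pi$ is visible.

```python
import math

A, EPS = 1.0, 0.01   # illustrative constants, not physical values

def rhs(u, v):
    """Right side of the first order system (4.6.54)."""
    return v, -u + A + EPS * u * u

def rk4(u, v, h):
    """One classical Runge-Kutta step in theta."""
    k1u, k1v = rhs(u, v)
    k2u, k2v = rhs(u + h / 2 * k1u, v + h / 2 * k1v)
    k3u, k3v = rhs(u + h / 2 * k2u, v + h / 2 * k2v)
    k4u, k4v = rhs(u + h * k3u, v + h * k3v)
    return (u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u),
            v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

def F(u, v):
    """The conserved quantity (4.6.55)."""
    return 0.5 * v * v + 0.5 * u * u - A * u - EPS * u ** 3 / 3

def perihelion_period(u0, h=1e-3):
    """Start at a maximum of u (v = 0); return (theta at next maximum, drift of F)."""
    u, v, theta = u0, 0.0, 0.0
    F0, drift = F(u, v), 0.0
    while True:
        un, vn = rk4(u, v, h)
        drift = max(drift, abs(F(un, vn) - F0))
        if v > 0.0 and vn <= 0.0:
            # v changes sign from + to -: interpolate linearly for the zero.
            return theta + h * v / (v - vn), drift
        u, v, theta = un, vn, theta + h

period, F_drift = perihelion_period(1.2)
print(period, 2 * math.pi, F_drift)
```

For these sample values the measured period exceeds $2\pi$ by roughly one percent, so the perihelion advances a little on each revolution, while $F$ stays constant to within integration error.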


——— 
Exercises 


1. Solve explicitly

$$w''(t) = -w(t),$$

for $w$ taking values in $\mathbb{R}^2 = \mathbb{C}$. Show that

$$|w(t)|^2 + |w'(t)|^2 = 2E$$

is constant on each orbit.


2. For $w(t)$ taking values in $\mathbb{C}$, define a new curve by

$$z(s) = w(t)^2, \qquad \frac{ds}{dt} = |w(t)|^2.$$

Show that if $w''(t) = -w(t)$, then

$$z''(s) = -4E\, \frac{z(s)}{|z(s)|^3},$$

so $z(s)$ solves the Kepler problem.


3. Take $u = 1/r$ as in (4.6.42), and generalize the calculations (4.6.43)–(4.6.46)
to obtain a differential equation for $u$ as a function of $\theta$, for more general central
forces. Consider particularly $F(x) = -\nabla V(x)$ in the cases


V(x) = —-K|lz|)?, V(x) = —K lz. 


4. Take the following steps to show that if $p > 0$ and $0 < e < 1$, then

$$r (1 + e \cos\theta) = p \tag{4.6.57}$$

is the equation in polar coordinates of an ellipse.

(a) Show that (4.6.57) describes a closed, bounded curve, since $1 + e\cos\theta > 0$ for
all $\theta$ if $0 < e < 1$, and $\cos\theta$ is periodic in $\theta$ of period $2\pi$. Denote the curve by
$\gamma(\theta) = (x(\theta), y(\theta))$, in Cartesian coordinates.

(b) Show that this curve is symmetric about the $x$-axis and cuts the axis at two
points, whose distance apart is

$$2a = r(0) + r(\pi),$$

so

$$a = \frac{p}{1 - e^2}. \tag{4.6.58}$$

(c) Show that the midpoint between $\gamma(0)$ and $\gamma(\pi)$ is given by

$$x_0 = -ea, \qquad y_0 = 0.$$

(d) For $\gamma(\theta) = (x(\theta), y(\theta))$, as in part (a), show that

$$\frac{(x + ea)^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad b^2 = (1 - e^2)\, a^2. \tag{4.6.59}$$

Hint. Use (4.6.57) and its square to write

$$r + e x = p, \ \text{ so } \ r = p - e x, \qquad x^2 + y^2 + 2 e r x + e^2 x^2 = p^2, \tag{4.6.60}$$

hence $x^2 + y^2 + 2e(p - ex)x + e^2 x^2 = p^2$, or equivalently

$$(1 - e^2)\, x^2 + 2 e p x + y^2 = p^2, \tag{4.6.61}$$

and proceed to derive (4.6.59), taking into account (4.6.58).


5. As an approximation, assume that the Earth has a circular orbit about the sun
with a radius

$$a = 1.496 \times 10^{11}\ \text{m}, \tag{4.6.62}$$

and its period is one year, i.e.,

$$T = 31.536 \times 10^{6}\ \text{sec}. \tag{4.6.63}$$

The gravitational constant $G$ has been measured as

$$G = 6.674 \times 10^{-11}\ \text{m}^3/(\text{kg sec}^2). \tag{4.6.64}$$

With this information, use (4.6.36) and (4.6.52) to calculate the mass $m_2$ of the
sun. Assume the mass of the Earth is negligible compared to $m_2$. You should get

$$m_2 = \alpha \times 10^{30}\ \text{kg}, \tag{4.6.65}$$

with $\alpha$ between 1 and 10.

REMARK. Historically, $T$ was measured by the position of the “fixed stars.” Modern
methods to measure $a$ involve bouncing a radar signal off Venus to measure its
distance, given that we have an accurate measurement of the speed of light. Then
trigonometry is used to determine $a$. See [17] for a discussion of how $G$ has been
measured; this is the most difficult issue.

6. The force of gravity the Earth exerts on a body of mass $m$ at the Earth's surface
is

$$-G m m_e r^{-2}, \tag{4.6.66}$$

where $G$ is given in Exercise 5,

$$r = 6.38 \times 10^{6}\ \text{m} \tag{4.6.67}$$

is the radius of the Earth, and $m_e$ is the mass of the Earth. It is observed that
the Earth's gravity accelerates objects at its surface downward at $9.8\ \text{m}/\text{sec}^2$, so
we have

$$9.8\ \text{m}/\text{sec}^2 = G m_e r^{-2}. \tag{4.6.68}$$

Use this to compute $m_e$. You should get

$$m_e = \beta \times 10^{24}\ \text{kg}, \tag{4.6.69}$$

with $\beta$ between 1 and 10.
REMARK. See Appendix 4.F for more on (4.6.66).


7. As an approximation, assume that the moon has a circular orbit about the 
Earth, of radius 


$$a = 3.8 \times 10^{8}\ \text{m},$$

and its period is 27.3 days, i.e.,

$$T = 2.359 \times 10^{6}\ \text{sec}.$$


Assume the mass of the moon is negligible compared to the mass of the Earth. Use 
the method of Exercise 5 to calculate the mass of the Earth. Compare your result 
with that of Exercise 6. 


8. Use the data presented in Exercises 5 and 7 to calculate the ratio of the masses 
of the Earth and the sun, irrespective of the knowledge of G. 


9. Jupiter has a moon, Ganymede, which orbits the planet at a distance $1.07 \times 10^{9}$
m, with a period of 7.15 Earth days. Using the method of Exercise 5 (or 8), compute
the mass $m_J$ of Jupiter. You should get

$$m_J \approx 318\, m_e.$$


4.7. Variational problems and the stationary action principle 


A rich source of second order systems of differential equations is provided by variational
problems, which we will consider here. Let $\Omega \subset \mathbb{R}^n$ be open, and let
$L \in C^2(\Omega \times \mathbb{R}^n)$, say $L = L(x, v)$. For a path $u : [a, b] \to \Omega$, consider

$$I(u) = \int_a^b L(u(t), u'(t))\, dt. \tag{4.7.1}$$

We desire to find equations for a path that minimizes $I(u)$, among all such paths
for which the endpoints $u(a) = p$ and $u(b) = q$ are fixed. More generally, we desire
to specify when $u$ is a stationary path, meaning that

$$\frac{d}{ds} I(u_s) \Big|_{s=0} = 0, \tag{4.7.2}$$

for all smooth families of paths $u_s$ such that $u_0 = u$, $u_s(a) = p$, and $u_s(b) = q$. Let
us write

$$\frac{\partial}{\partial s} u_s(t) \Big|_{s=0} = w(t), \tag{4.7.3}$$

so $w : [a, b] \to \mathbb{R}^n$ is an arbitrary smooth function such that $w(a) = w(b) = 0$. To
compute $(d/ds) I(u_s)$, let us denote

$$L_{x_k} = \frac{\partial L}{\partial x_k}, \qquad L_{v_k} = \frac{\partial L}{\partial v_k}. \tag{4.7.4}$$

Then

$$\frac{d}{ds} I(u_s) \Big|_{s=0} = \int_a^b \sum_k L_{x_k}(u(t), u'(t))\, w_k(t)\, dt + \int_a^b \sum_k L_{v_k}(u(t), u'(t))\, w_k'(t)\, dt. \tag{4.7.5}$$

We can apply integration by parts to the last integral. The condition that $w_k(a) =
w_k(b) = 0$ implies that there are no endpoint contributions, so

$$\frac{d}{ds} I(u_s) \Big|_{s=0} = \int_a^b \sum_k \Bigl[ L_{x_k}(u(t), u'(t)) - \frac{d}{dt} L_{v_k}(u(t), u'(t)) \Bigr] w_k(t)\, dt. \tag{4.7.6}$$

For this to vanish for all smooth $w_k$ that vanish at $t = a$ and $b$, it is necessary and
sufficient that

$$\frac{d}{dt} L_{v_k}(u(t), u'(t)) - L_{x_k}(u(t), u'(t)) = 0, \qquad \forall\, k. \tag{4.7.7}$$

This system is called the Lagrange equation for stationarity of (4.7.1). Applying
the chain rule to the first term, we can expand this out as

$$\sum_\ell L_{v_k v_\ell}(u(t), u'(t))\, u_\ell''(t) + \sum_\ell L_{v_k x_\ell}(u(t), u'(t))\, u_\ell'(t) - L_{x_k}(u(t), u'(t)) = 0, \qquad \forall\, k. \tag{4.7.8}$$

This can be converted to a first order system for $(u(t), u'(t))$, to which the results
of §4.1 apply, provided the $n \times n$ matrix

$$\bigl( L_{v_k v_\ell}(x, v) \bigr) \tag{4.7.9}$$

of second order partial derivatives of $L(x, v)$ with respect to $v$ is invertible.


The Newtonian equations of motion can be put into this Lagrangian framework,
as follows. A particle of mass $m$, position $x$, and velocity $v$, moving in a force field
$F(x) = -\nabla V(x)$, has kinetic energy and potential energy

$$T = \frac{1}{2}\, m \|v\|^2, \quad \text{and} \quad V = V(x), \tag{4.7.10}$$

respectively. The Lagrangian $L(x, v)$ is given by the difference:

$$L(x, v) = T - V = \frac{1}{2}\, m \|v\|^2 - V(x). \tag{4.7.11}$$

Figure 4.7.1. Pendulum

In such a case,

$$L_{v_k}(x, v) = m v_k, \qquad L_{x_k}(x, v) = -\frac{\partial V}{\partial x_k}, \tag{4.7.12}$$

and the Lagrange system (4.7.7) becomes the standard Newtonian system

$$m \frac{d^2 u}{dt^2} = -\nabla V(u). \tag{4.7.13}$$

In this setting, the integral (4.7.1) is called the action. The assertion that the laws
of motion are given by the stationary condition for (4.7.1) where $L$ is the Lagrangian
(4.7.11) is the stationary action principle.
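To make the stationary condition (4.7.2) concrete, here is a small numerical sketch. It takes $n = 1$, $m = 1$, and $V(x) = x^2/2$ on $[a, b] = [0, 1]$, so that (4.7.13) reads $u'' = -u$, and approximates $(d/ds) I(u_s)|_{s=0}$ by a central difference for the family $u_s = u + s w$ with $w(t) = \sin \pi t$. The potential, the interval, the variation $w$, and the comparison path are all choices made for this illustration.

```python
import math

def action(path, dpath, n=2000, a=0.0, b=1.0):
    """Midpoint-rule approximation of (4.7.1) for L(x, v) = v^2/2 - x^2/2."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, v = path(t), dpath(t)
        total += (0.5 * v * v - 0.5 * x * x) * h
    return total

def first_variation(u, du, s=1e-3):
    """Central difference for d/ds I(u + s w) at s = 0, with w(0) = w(1) = 0."""
    w = lambda t: math.sin(math.pi * t)
    dw = lambda t: math.pi * math.cos(math.pi * t)
    plus = action(lambda t: u(t) + s * w(t), lambda t: du(t) + s * dw(t))
    minus = action(lambda t: u(t) - s * w(t), lambda t: du(t) - s * dw(t))
    return (plus - minus) / (2 * s)

# u(t) = sin t solves u'' = -u, with u(0) = 0 and u(1) = sin 1.
newton = first_variation(math.sin, math.cos)

# The straight line with the same endpoints does not solve u'' = -u.
c = math.sin(1.0)
chord = first_variation(lambda t: c * t, lambda t: c)
print(newton, chord)
```

The first value is zero up to quadrature error, while the second is close to $-\sin(1)/\pi$, so the straight line is not a stationary path for this action. A single variation $w$ only tests a necessary condition; (4.7.2) requires the derivative to vanish for all admissible $w$.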


The Lagrangian approach can be particularly convenient in situations where
coordinates other than Cartesian coordinates are used. As an example, we consider
the simple pendulum problem, and give a treatment that can be compared and
contrasted with that given in §1.6 of Chapter 1. As there, we have a rigid rod, of
length $\ell$, suspended at one end. We assume the rod has negligible mass, except for
an object of mass $m$ at the other end. See Figure 4.7.1. The rod makes an angle $\theta$
with the downward vertical. We seek a differential equation for $\theta$ as a function of
$t$.

The end with the mass $m$ traces out a path in a plane, which, as in Chapter 1,
we identify with the complex plane, with the origin at the point where the pendulum
is suspended and the real axis pointing vertically down. We can write the path as

$$z(t) = \ell\, e^{i\theta(t)}. \tag{4.7.14}$$

The velocity is

$$v(t) = z'(t) = i \ell\, \theta'(t)\, e^{i\theta(t)}, \tag{4.7.15}$$

so the kinetic energy is

$$T = \frac{1}{2}\, m \|v(t)\|^2 = \frac{m \ell^2}{2}\, \theta'(t)^2. \tag{4.7.16}$$

Meanwhile the potential energy, due to the force of gravity, is

$$V = -m g \ell \cos\theta. \tag{4.7.17}$$

Taking $\psi = \theta'$, we have the Lagrangian

$$L(\theta, \psi) = \frac{m \ell^2}{2}\, \psi^2 + m g \ell \cos\theta, \tag{4.7.18}$$

hence

$$L_\psi(\theta, \psi) = m \ell^2 \psi, \qquad L_\theta(\theta, \psi) = -m g \ell \sin\theta,$$

and Lagrange's equation

$$\frac{d}{dt} L_\psi(\theta(t), \theta'(t)) - L_\theta(\theta(t), \theta'(t)) = 0 \tag{4.7.19}$$

yields the pendulum equation

$$\frac{d^2 \theta}{dt^2} + \frac{g}{\ell} \sin\theta = 0, \tag{4.7.20}$$

in agreement with (1.6.9).

The approach above avoided a computation of the force acting on the pendulum
(cf. (1.6.6)), and is arguably a bit simpler than the approach given in Chapter 1.
The Lagrangian approach can be very much simpler in more complex situations,
such as the double pendulum, which we will discuss in §4.9.

An important variant of these variational problems is the class of constrained
variational problems, which we now discuss. For the sake of definiteness, let $M$ be
either a smooth curve in $\Omega \subset \mathbb{R}^2$ or a smooth surface in $\Omega \subset \mathbb{R}^3$, and let $n(x)$ be a
smooth unit normal to $M$, for $x \in M$. Again, let $L \in C^2(\Omega \times \mathbb{R}^n)$, $n = 2$ or $3$, and
define $I(u)$ by (4.7.1). We look for equations for

$$u : [a, b] \longrightarrow M, \tag{4.7.21}$$

satisfying the stationary condition (4.7.2), not for all smooth families of paths $u_s$
such that $u_0 = u$ and $u_s(a) = p$, $u_s(b) = q$, but rather for all such paths satisfying
the constraint

$$u_s : [a, b] \longrightarrow M. \tag{4.7.22}$$

Again we take $w(t)$ as in (4.7.3), and this time we obtain an arbitrary smooth
function $w : [a, b] \to \mathbb{R}^n$, satisfying $w(a) = w(b) = 0$, and the additional constraint

$$w(t) \cdot n(u(t)) = 0. \tag{4.7.23}$$

The calculations (4.7.4)–(4.7.6) still apply, but from here we get a conclusion different
from (4.7.7). Since (4.7.6) holds for all $w(t)$ described as just above, the
conclusion is

$$\frac{d}{dt} L_v(u(t), u'(t)) - L_x(u(t), u'(t)) \ \text{ is parallel to } \ n(u(t)), \tag{4.7.24}$$

where $L_v = (L_{v_1}, \dots, L_{v_n})^t$ and $L_x = (L_{x_1}, \dots, L_{x_n})^t$. In case $n = 3$, an equivalent
formulation of (4.7.24) is

$$\Bigl[ \frac{d}{dt} L_v(u(t), u'(t)) - L_x(u(t), u'(t)) \Bigr] \times n(u(t)) = 0. \tag{4.7.25}$$

Let's specialize this constrained variational problem to the case

$$L(x, v) = \frac{1}{2} \|v\|^2. \tag{4.7.26}$$

The associated integral

$$E(u) = \frac{1}{2} \int_a^b \|u'(t)\|^2\, dt \tag{4.7.27}$$

is called the energy of $u : [a, b] \to M$. In this case, $L_v = v$ and $L_x = 0$, so (4.7.24)
becomes

$$u''(t) \ \text{ is parallel to } \ n(u(t)). \tag{4.7.28}$$

That is, $u''(t) = a(t)\, n(u(t))$. Taking the inner product with $n(u(t))$ gives $a(t) =
n(u(t)) \cdot u''(t)$, so (4.7.28) yields

$$u''(t) = \bigl( n(u(t)) \cdot u''(t) \bigr)\, n(u(t)). \tag{4.7.29}$$

An equation with a better form can be obtained by differentiating

$$u'(t) \cdot n(u(t)) = 0, \tag{4.7.30}$$

to get

$$u''(t) \cdot n(u(t)) = -u'(t) \cdot \frac{d}{dt} n(u(t)). \tag{4.7.31}$$

Plugging this into the right side of (4.7.29) gives the differential equation

$$u''(t) + \Bigl[ u'(t) \cdot \frac{d}{dt} n(u(t)) \Bigr] n(u(t)) = 0. \tag{4.7.32}$$

Note by (4.7.28) that $u''(t)$ is orthogonal to $u'(t)$, so

$$\frac{d}{dt} \|u'(t)\|^2 = 2 u'(t) \cdot u''(t) = 0. \tag{4.7.33}$$

Thus stationary paths $u : [a, b] \to M$ for the energy have constant speed.
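As a check on (4.7.32) and (4.7.33), consider the unit sphere in $\mathbb{R}^3$, where one can take $n(x) = x$, so $(d/dt)\, n(u(t)) = u'(t)$ and (4.7.32) becomes $u'' + \|u'\|^2 u = 0$. The sketch below integrates this system with a Runge-Kutta scheme and compares the result with the great circle through the initial point in the initial direction. The sphere, the initial data, and the step size are choices made for this example only.

```python
import math

def rhs(state):
    """First order form of u'' = -|u'|^2 u, i.e. (4.7.32) with n(x) = x."""
    u, du = state[:3], state[3:]
    speed2 = sum(c * c for c in du)
    return du + tuple(-speed2 * c for c in u)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shift(k, a):
        return tuple(s + a * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shift(k1, h / 2))
    k3 = rhs(shift(k2, h / 2))
    k4 = rhs(shift(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, h, steps):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# u(0) on the unit sphere, u'(0) tangent to it and of unit length.
start = (1.0, 0.0, 0.0, 0.0, 0.6, 0.8)
end = integrate(start, 1e-3, 1000)          # advance to t = 1

radius = math.sqrt(sum(c * c for c in end[:3]))
speed = math.sqrt(sum(c * c for c in end[3:]))
great_circle = (math.cos(1.0), 0.6 * math.sin(1.0), 0.8 * math.sin(1.0))
error = max(abs(p - q) for p, q in zip(end[:3], great_circle))
print(radius, speed, error)
```

The computed path stays on the sphere, keeps unit speed, and agrees with the great circle $u(t) = (\cos t)\, u(0) + (\sin t)\, u'(0)$ to within integration error.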


Such curves on $M$ are geodesics. These curves are also constant speed curves
on $M$ that are stationary curves for the arclength:

$$\ell(u) = \int_a^b \|u'(t)\|\, dt. \tag{4.7.34}$$

We will say a bit more about geodesics in Appendix 4.H. Further material can be
found in Chapter 6 of [49], and also in texts on elementary differential geometry,
such as [11].

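We next present another approach to finding equations for stationary paths of
(4.7.27). Suppose $\Omega = \mathcal{O} \times \mathbb{R}$ and $M$ is the graph of a function $z = \varphi(x_1, x_2)$, for
$x = (x_1, x_2) \in \mathcal{O}$. Then a curve $u : [a, b] \to M$ has the form

$$u(t) = \bigl( x(t), \varphi(x(t)) \bigr), \tag{4.7.35}$$

and

$$u'(t) = \bigl( x'(t), \nabla\varphi(x(t)) \cdot x'(t) \bigr), \tag{4.7.36}$$

so

$$\|u'(t)\|^2 = \|x'(t)\|^2 + \bigl( \nabla\varphi(x(t)) \cdot x'(t) \bigr)^2 = x'(t) \cdot G(x(t))\, x'(t), \tag{4.7.37}$$

where

$$G(x) = \begin{pmatrix} 1 + \varphi_1(x)^2 & \varphi_1(x) \varphi_2(x) \\ \varphi_1(x) \varphi_2(x) & 1 + \varphi_2(x)^2 \end{pmatrix}, \qquad \varphi_j(x) = \frac{\partial \varphi}{\partial x_j}. \tag{4.7.38}$$

Thus the problem of finding a constrained stationary path $u(t)$ for the energy
(4.7.27) is equivalent to the problem of finding an unconstrained stationary path
$x(t)$ for

$$E(x) = \frac{1}{2} \int_a^b x'(t) \cdot G(x(t))\, x'(t)\, dt. \tag{4.7.39}$$

In this case,

$$L(x, v) = \frac{1}{2}\, v \cdot G(x) v, \qquad L_v(x, v) = G(x) v, \qquad L_x(x, v) = \frac{1}{2}\, v \cdot \nabla G(x) v, \tag{4.7.40}$$

where the last identity means

$$L_{x_k}(x, v) = \frac{1}{2}\, v \cdot \frac{\partial G}{\partial x_k}\, v. \tag{4.7.41}$$

In this setting, the Lagrange equation (4.7.7) becomes

$$\frac{d}{dt} \bigl[ G(x(t))\, x'(t) \bigr] - \frac{1}{2}\, x'(t) \cdot \nabla G(x(t))\, x'(t) = 0, \tag{4.7.42}$$

i.e.,

$$\frac{d}{dt} \sum_j G_{kj}(x(t))\, x_j'(t) - \frac{1}{2} \sum_{i,j} \frac{\partial G_{ij}}{\partial x_k}(x(t))\, x_i'(t)\, x_j'(t) = 0, \qquad \forall\, k. \tag{4.7.43}$$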


Exercises 


1. Given a Lagrangian $L(x, v)$, we define the “energy”

$$E(x, v) = L_v(x, v) \cdot v - L(x, v) = \sum_k L_{v_k}(x, v)\, v_k - L(x, v). \tag{4.7.44}$$

Show that if $u(t)$ solves the Lagrange equation (4.7.7), then

$$\frac{d}{dt} E(u(t), u'(t)) = 0. \tag{4.7.45}$$

This is energy conservation, in this setting.


2. Suppose

$$L(x, v) = \frac{m}{2}\, v \cdot G(x) v - V(x). \tag{4.7.46}$$

Assume $G(x) \in M(n, \mathbb{R})$ is symmetric and invertible, and define $E(x, v)$ as in
(4.7.44). Show that

$$E(x, v) = \frac{m}{2}\, v \cdot G(x) v + V(x). \tag{4.7.47}$$


3. Let $L(x, v)$ be given by (4.7.46). Show that the Lagrange equation (4.7.7) is

$$m \frac{d}{dt} \bigl[ G(u(t))\, u'(t) \bigr] - \frac{m}{2}\, u'(t) \cdot \nabla G(u(t))\, u'(t) = -\nabla V(u(t)), \tag{4.7.48}$$

where the second term is evaluated as in (4.7.42)–(4.7.43). Show in turn that this
yields the first order system

$$\begin{aligned}
\frac{du}{dt} &= v, \\
m \sum_j G_{kj}(u(t))\, \frac{dv_j}{dt} &= -m \sum_{i,j} \Bigl[ \frac{\partial G_{kj}}{\partial x_i} - \frac{1}{2} \frac{\partial G_{ij}}{\partial x_k} \Bigr] v_i(t)\, v_j(t) - \frac{\partial V}{\partial x_k}(u(t)).
\end{aligned}$$

Produce a variant by symmetrizing the term in brackets in the second sum, with
respect to $i$ and $j$.


4. Consider the setting of constrained motion on $M \subset \Omega$, as in (4.7.21)–(4.7.24),
and consider the following generalization of (4.7.26):

$$L(x, v) = \frac{m}{2} \|v\|^2 - V(x). \tag{4.7.49}$$

Establish the following replacement for (4.7.32):

$$m u''(t) + m \Bigl[ u'(t) \cdot \frac{d}{dt} n(u(t)) \Bigr] n(u(t)) = -P_M(u(t))\, \nabla V(u(t)), \tag{4.7.50}$$

where, for $x \in M$, $w \in \mathbb{R}^n$,

$$P_M(x) w = w - (n(x) \cdot w)\, n(x). \tag{4.7.51}$$

This describes motion of a particle in a force field $F(x) = -\nabla V(x)$, constrained to
move on $M$.


5. Motion of a spherical pendulum in $\mathbb{R}^3$, in the presence of Earth's gravitational
field, is described as in Exercise 4 with

$$M = \{ x \in \mathbb{R}^3 : \|x\| = \ell \}, \tag{4.7.52}$$

and $L(x, v)$ as in (4.7.49), with $V(x) = m g\, (x \cdot k)$, where $k = (0, 0, 1)^t$. Show that
in this case, (4.7.50) produces, for

$$u(t) = \ell\, \omega(t), \tag{4.7.53}$$

the system

$$\omega''(t) + \|\omega'(t)\|^2\, \omega(t) = -\frac{g}{\ell} \bigl[ k - (\omega(t) \cdot k)\, \omega(t) \bigr]. \tag{4.7.54}$$


6. Results of Exercise 5 are also valid in the setting where $\mathbb{R}^3$ is replaced by $\mathbb{R}^2$.
Show that, in this setting, with

$$\omega(t) = (\sin\theta(t), -\cos\theta(t))^t, \qquad k = (0, 1)^t, \tag{4.7.55}$$

the equation (4.7.54) leads to the (planar) pendulum equation

$$\theta''(t) + \frac{g}{\ell} \sin\theta(t) = 0. \tag{4.7.56}$$


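7. Let us return to the setting of Exercise 2, and set

$$p = L_v(x, v) = m G(x) v. \tag{4.7.57}$$

Also set

$$\mathcal{E}(x, p) = E(x, v) = E\bigl(x, G(x)^{-1} p / m\bigr). \tag{4.7.58}$$

Show that

$$\mathcal{E}(x, p) = \frac{1}{2m}\, p \cdot G(x)^{-1} p + V(x). \tag{4.7.59}$$

Show that the Lagrange equation (4.7.48) for $u(t) = x(t)$ is equivalent to the
following Hamiltonian system:

$$\frac{dx_k}{dt} = \frac{\partial \mathcal{E}}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial \mathcal{E}}{\partial x_k}. \tag{4.7.60}$$

Hint. To get started on (4.7.60), note that if (4.7.59) holds, then

$$\frac{\partial \mathcal{E}}{\partial p} = \frac{1}{m}\, G(x)^{-1} p = v, \tag{4.7.61}$$

and that the Lagrange equation implies

$$\frac{dp_k}{dt} = L_{x_k}(x, v) = \frac{m}{2}\, v \cdot \frac{\partial G}{\partial x_k}(x)\, v - \frac{\partial V}{\partial x_k}(x). \tag{4.7.62}$$

Furthermore, as in (3.8.13),

$$\frac{\partial}{\partial x_k} G(x)^{-1} = -G(x)^{-1}\, \frac{\partial G}{\partial x_k}(x)\, G(x)^{-1}. \tag{4.7.63}$$

Remark. More general cases in which the change of variable $p = L_v(x, v)$ converts
Lagrange's equation to Hamiltonian form are discussed in [1], [5], and Chapter 1
of [45].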


Exercises 8-11 study surfaces of revolution that are surfaces of least area. To set
this up, let $u : [0, 1] \to (0, \infty)$ be smooth, and rotate the graph of $y = u(x)$ about
the $x$-axis in $(x, y, z)$-space. Elementary calculus gives the formula

$$A(u) = 2\pi \int_0^1 u(t) \sqrt{1 + u'(t)^2}\, dt \tag{4.7.64}$$

for the area of the resulting surface of revolution. The problem is to find $u$ for
which the area is minimal, given constraints

$$u(0) = \alpha, \qquad u(1) = \beta, \qquad \alpha, \beta > 0. \tag{4.7.65}$$


8. In (4.7.64), $L(x, v) = x \sqrt{1 + v^2}$. Show that the “energy” $E(x, v)$ in (4.7.44) is
given by

$$E(x, v) = -\frac{x}{\sqrt{1 + v^2}}. \tag{4.7.66}$$


9. Using (4.7.45), show that if $u(t)$ solves the Lagrange equation (4.7.7) in this
setting, then there is a constant $a$ such that

$$\frac{u(t)}{\sqrt{1 + u'(t)^2}} = a, \tag{4.7.67}$$

hence

$$\frac{du}{dt} = \pm \sqrt{b^2 u^2 - 1}, \qquad b = \frac{1}{a}. \tag{4.7.68}$$


10. Separate variables in (4.7.68), and use the substitution $bu = \cosh v$ to evaluate
the $u$-integral and conclude that

$$u(t) = \frac{1}{b} \cosh(bt + c), \tag{4.7.69}$$

for some constant $c$. Equation (4.7.69) is the equation of a catenary, seen before in
(1.3.24), for the hanging cable.


11. Consider the problem of finding $b$ and $c$ in (4.7.69) such that the constraints
(4.7.65) are satisfied. Show that sometimes no solutions exist, and sometimes two
solutions exist, but one gives a smaller area than the other.


Exercises 12-15 take another look at the hanging cable problem mentioned in Exercise
10. Here we state it as the problem of minimizing the potential energy, which
is $mg$ times

$$V(u) = \int_{-A}^{A} u(t) \sqrt{1 + u'(t)^2}\, dt, \tag{4.7.70}$$

subject to the boundary conditions

$$u(-A) = u(A) = 0, \tag{4.7.71}$$

and the constraint that the curve $y = u(x)$, $-A \le x \le A$, have length $L$,

$$\ell(u) = \int_{-A}^{A} \sqrt{1 + u'(t)^2}\, dt = L. \tag{4.7.72}$$

Such a curve describes a cable, of length $L$, hanging from the two points $(-A, 0)$
and $(A, 0)$, under the force of gravity. To deal with the constraint (4.7.72), we bring
in the Lagrange multiplier method. That is, we set

$$I_\lambda(u) = V(u) + \lambda\, \ell(u), \tag{4.7.73}$$

find the stationary path for (4.7.73) (subject to (4.7.71)) as a function of $\lambda$, and
then find for which $\lambda$ the constraint (4.7.72) holds. Note that $I_\lambda(u)$ has the form
(4.7.1) with

$$L_\lambda(x, v) = (x + \lambda) \sqrt{1 + v^2}. \tag{4.7.74}$$


12. Show that the “energy” $E(x, v)$ in (4.7.44) is given by

$$E_\lambda(x, v) = -\frac{x + \lambda}{\sqrt{1 + v^2}}. \tag{4.7.75}$$


13. Using (4.7.45), show that if $u(t)$ solves the Lagrange equation (4.7.7) in this
setting, then there exists a constant $a$ (maybe depending on $\lambda$) such that

$$\frac{u(t) + \lambda}{\sqrt{1 + u'(t)^2}} = a, \tag{4.7.76}$$

hence

$$\frac{du}{dt} = \pm \sqrt{b^2 (u + \lambda)^2 - 1}, \qquad b = \frac{1}{a}. \tag{4.7.77}$$


14. Separate variables in (4.7.77) and use the substitution $b(u + \lambda) = \cosh v$ to
evaluate the $u$-integral and obtain

$$u(t) = -\lambda + \frac{1}{b} \cosh(bt + c),$$

for some constant $c$. Show that (4.7.71) forces $c = 0$, so

$$u(t) = -\lambda + \frac{1}{b} \cosh bt. \tag{4.7.78}$$


15. Calculate the length of the curve $y = u(x)$, $-A \le x \le A$, when $u$ is given by
(4.7.78), and show that the constraints (4.7.71)–(4.7.72) yield the equations

$$\sinh bA = \frac{bL}{2}, \qquad \lambda = \frac{1}{b} \cosh bA. \tag{4.7.79}$$

Note that the first equation has a unique solution $b \in (0, \infty)$ if and only if $L > 2A$.


16. Recall the planar pendulum problem illustrated in Figure 4.7.1. Instead of
assuming all the mass is at the end of the rod, assume the rod has a mass distribution
$m(s)\, ds$, $0 \le s \le \ell$, so the total mass is $m = \int_0^\ell m(s)\, ds$. Show that for the
potential energy $V$ you replace (4.7.17) by

$$V = -m_1 g \ell \cos\theta, \qquad m_1 = \int_0^\ell m(s)\, \frac{s}{\ell}\, ds, \tag{4.7.80}$$

and for the kinetic energy $T$, you replace (4.7.16) by

$$T = \frac{m_2 \ell^2}{2}\, \theta'(t)^2, \qquad m_2 = \int_0^\ell m(s) \Bigl( \frac{s}{\ell} \Bigr)^2 ds. \tag{4.7.81}$$

Write down the replacement for the pendulum equation (4.7.20) in this setting.
Specialize the calculation to the case

$$m(s) = \frac{m}{\ell}, \qquad 0 \le s \le \ell, \tag{4.7.82}$$

which represents a rod with uniform mass distribution.


Figure 4.8.1. Brachistochrone problem


4.8. The brachistochrone problem 


The early masters of calculus enjoyed posing challenging problems to each other. 
The most famous of these is called the brachistochrone problem. It was posed by
Johann Bernoulli in 1696, and solved by him, by his brother Jakob, and also by 
Newton and by Leibniz. The problem is to find the curve along which a particle 
will slide without friction in the minimum time, from one given point p in the 
(x, y)-plane to another, q, starting at rest at p. Say p = (0,0) and q = (a,b). We 
assume a > 0 and b < 0; see Figure 4.8.1. The force of gravity acts in the direction 
of the negative y-axis, with acceleration g. 


Our approach to this problem will involve two applications of the variational
method developed in §4.7. (In fact, this problem helped spark the creation of the
variational method.) First, let $\varphi : [0, a] \to \mathbb{R}$ with $\varphi(0) = 0$, $\varphi(a) = b$, and consider
the constrained motion of a particle,

$$u : [0, t_0] \longrightarrow M = \{ (x, \varphi(x)) : 0 \le x \le a \}, \tag{4.8.1}$$

under the force of gravity. Thus, in place of (4.7.27), we look for stationary paths
for

$$I(u) = \int_0^{t_0} \Bigl[ \frac{m}{2} \|u'(t)\|^2 - V(u(t)) \Bigr] dt, \tag{4.8.2}$$

subject to the constraint (4.8.1), and with

$$V(x, y) = m g y. \tag{4.8.3}$$

We can convert this to an unconstrained variational problem as was done in (4.7.35)–
(4.7.42), now with a nonzero $V$, and with lower dimension. We have

$$u(t) = \bigl( x(t), \varphi(x(t)) \bigr), \tag{4.8.4}$$

and

$$\|u'(t)\|^2 = \bigl( 1 + \varphi'(x(t))^2 \bigr)\, x'(t)^2, \tag{4.8.5}$$

so the problem of finding a constrained stationary path $u(t)$ for (4.8.2) is equivalent
to the problem of finding an unconstrained stationary path $x(t)$ for

$$J(x) = \int_0^{t_0} L(x(t), x'(t))\, dt, \tag{4.8.6}$$

with

$$L(x, v) = \frac{m}{2} \bigl( 1 + \varphi'(x)^2 \bigr) v^2 - m g\, \varphi(x). \tag{4.8.7}$$

The path $x(t)$ is governed by the differential equation

$$\frac{d}{dt} L_v(x(t), x'(t)) - L_x(x(t), x'(t)) = 0. \tag{4.8.8}$$

We need not write this more explicitly, since by now our experience tells us that to
describe solutions to such a single equation, all we need is conservation of energy:

$$E(x, v) = \frac{m}{2} \bigl( 1 + \varphi'(x)^2 \bigr) v^2 + m g\, \varphi(x), \tag{4.8.9}$$

that is, for a solution to (4.8.8),

$$\frac{m}{2} \bigl( 1 + \varphi'(x(t))^2 \bigr)\, x'(t)^2 + m g\, \varphi(x(t)) = E \tag{4.8.10}$$

is constant. In the current set-up, $x(0) = 0$ and $x'(0) = 0$, so $E = 0$. We get

$$\frac{dx}{dt} = \sqrt{ \frac{-2 g\, \varphi(x)}{1 + \varphi'(x)^2} }, \tag{4.8.11}$$

which separates to

$$\frac{1}{\sqrt{2g}} \int_0^a \sqrt{ \frac{1 + \varphi'(x)^2}{-\varphi(x)} }\, dx = \int_0^{t_0} dt. \tag{4.8.12}$$

In other words, the elapsed time for the particle to move from $p = (0, 0)$ to $q = (a, b)$
along the path $y = \varphi(x)$ is given by the left side of (4.8.12).

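Hence the brachistochrone problem is reduced to the problem of finding the
function $\varphi : [0,a] \to \mathbb{R}$ that minimizes


(4.8.13)  $K(\varphi) = \int_0^a L(\varphi(x), \varphi'(x))\,dx,$

subject to the condition


(4.8.14)  $\varphi(0) = 0, \quad \varphi(a) = b,$

where


(4.8.15)  $L(\varphi,\psi) = \sqrt{\frac{1 + \psi^2}{-\varphi}}.$

A minimizer is a stationary path for (4.8.13), so it satisfies

(4.8.16)  $\frac{d}{dx} L_\psi(\varphi(x),\varphi'(x)) - L_\varphi(\varphi(x),\varphi'(x)) = 0.$

Note that

(4.8.17)  $L_\psi(\varphi,\psi) = \frac{\psi}{\sqrt{-\varphi(1+\psi^2)}}, \quad L_\varphi(\varphi,\psi) = \frac{1}{2\varphi^2}\sqrt{-\varphi(1+\psi^2)}.$

Solutions to (4.8.16) have the property that

(4.8.18)  $\mathcal{E}(\varphi(x), \varphi'(x)) = E$

is constant, where (parallel to (4.7.44))

(4.8.19)  $\mathcal{E}(\varphi,\psi) = L_\psi(\varphi,\psi)\psi - L(\varphi,\psi).$

Using (4.8.15) and (4.8.17), we have

(4.8.20)  $\mathcal{E}(\varphi,\psi) = \frac{1}{\sqrt{-\varphi(1+\psi^2)}}\bigl(\psi^2 - (1+\psi^2)\bigr) = -\frac{1}{\sqrt{-\varphi(1+\psi^2)}}.$

Thus, if $\varphi(x)$ satisfies (4.8.16), then

(4.8.21)  $\varphi(x)\bigl(1 + \varphi'(x)^2\bigr) = -k^2$, const,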


where we have written the constant as $-k^2$ to enforce the condition that $\varphi(x) < 0$
for $0 < x \le a$. For notational convenience, we make the change of variable


(4.8.22)  $y(x) = -\varphi(x),$

so (4.8.21) becomes

(4.8.23)  $y(x)\bigl(1 + y'(x)^2\bigr) = k^2,$

giving

(4.8.24)  $\frac{dy}{dx} = \sqrt{\frac{k^2}{y} - 1}.$


The equation (4.8.24) separates to


(4.8.25)  $\int \frac{dy}{\sqrt{\frac{k^2}{y} - 1}} = \int dx.$

The left integral has the form of (1.5.15) in Chapter 1, with $E_0 = -1$, $Km = k^2$.
Rather than recall the formulas (1.5.16)–(1.5.22), we implement the method
previewed in Exercise 3 of §1.5. We use the change of variable


(4.8.26)  $y = k^2 \sin^2\tau, \quad 2\tau = \theta.$


Figure 4.8.2. Cycloid


Then

(4.8.27)  $dy = 2k^2 \sin\tau\cos\tau\,d\tau, \quad \frac{k^2}{y} - 1 = \frac{\cos^2\tau}{\sin^2\tau},$

so

(4.8.28)  $\int \frac{dy}{\sqrt{\frac{k^2}{y} - 1}} = 2k^2 \int \sin^2\tau\,d\tau = \frac{k^2}{2}\int (1 - \cos\theta)\,d\theta = \frac{k^2}{2}(\theta - \sin\theta),$


the second identity because $\sin^2\tau = (1 - \cos 2\tau)/2$. Thus the curve (x, y(x)), $x \in [0,a]$,
is parametrized by


(4.8.29)  $x = \frac{k^2}{2}(\theta - \sin\theta), \quad y = \frac{k^2}{2}(1 - \cos\theta).$


The choice of $k^2 > 0$ is dictated by the implication


(4.8.30)  $0 < \theta < 2\pi, \quad \frac{k^2}{2}(\theta - \sin\theta) = a \implies \frac{k^2}{2}(1 - \cos\theta) = |b|.$


This solves the brachistochrone problem. The curve defined by (4.8.29) is
known as a cycloid. See Figure 4.8.2. Here p = k²/2.
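
The condition (4.8.30) can be explored numerically. The following is a minimal sketch, not part of the text's derivation: it finds $k^2$ and $\theta$ by bisection on the ratio $(\theta - \sin\theta)/(1 - \cos\theta) = a/|b|$, and compares the descent time along the cycloid with that along the straight line from p to q. The function names, the default g = 9.81, and the closed-form times (obtained by evaluating the left side of (4.8.12) on each curve) are assumptions made for this illustration.

```python
import math

def solve_cycloid(a, depth):
    """Solve (4.8.30) for (k2, theta), where depth = |b| > 0 and a > 0.

    The ratio (theta - sin theta)/(1 - cos theta) increases from 0 to
    +infinity on (0, 2*pi), so bisection on theta finds the unique root
    of ratio = a/depth.
    """
    target = a / depth
    lo, hi = 1e-6, 2.0 * math.pi - 1e-6
    for _ in range(200):  # far more halvings than double precision needs
        mid = 0.5 * (lo + hi)
        ratio = (mid - math.sin(mid)) / (1.0 - math.cos(mid))
        if ratio < target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    k2 = 2.0 * depth / (1.0 - math.cos(theta))
    return k2, theta

def cycloid_time(a, depth, g=9.81):
    """Left side of (4.8.12) on the cycloid: the integrand reduces to k d(theta)."""
    k2, theta = solve_cycloid(a, depth)
    return theta * math.sqrt(k2 / (2.0 * g))

def line_time(a, depth, g=9.81):
    """Left side of (4.8.12) for the straight line from (0, 0) to (a, -depth)."""
    length = math.hypot(a, depth)
    return length * math.sqrt(2.0 / (g * depth))
```

For instance, `cycloid_time(1.0, 1.0)` can be compared with `line_time(1.0, 1.0)` to see how much time the cycloid saves over the straight chute.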


REMARK. Note that $y'(0) = +\infty$, so the optimal path starts directly down.


Exercises


1. Show that for each $a, |b| \in (0,\infty)$, there is a unique $k^2 > 0$ such that $(a,|b|) \in \mathbb{R}^2$
lies on the curve (4.8.29), for some $\theta \in (0, 2\pi)$.
Hint. Consult Figure 4.8.2.


2. In the setting of Exercise 1, show that if $|b|/a < 2/\pi$, then $a > \pi k^2/2$, and the
optimal path dips below b before reaching the endpoint q = (a, b).


3. With $x(\theta)$ and $y(\theta)$ as in (4.8.29), set $\tilde{y}(\theta) = -y(\theta)$. Let

(4.8.31)  $\theta_1 = \pi, \quad \theta_0 \in [0, \pi).$


Show that the time it takes a particle starting at rest at $(x(\theta_0), \tilde{y}(\theta_0))$ to slide
down the curve $(x(\theta), \tilde{y}(\theta))$, $\theta_0 \le \theta \le \theta_1$, to the point $(x(\theta_1), \tilde{y}(\theta_1))$ (the bottom of
the cycloid) is independent of $\theta_0$. One says the cycloid also solves the tautochrone
problem.


4.9. The double pendulum 


Here we study the motion of a double pendulum, such as illustrated in Figure
4.9.1. We have a pair of rigid rods, of lengths $\ell_1$ and $\ell_2$, of negligible mass except
for objects of mass $m_1$ and $m_2$ attached to one end of each rod. The other end of
rod 1 is attached to a fixed point, and the end of rod 2 not containing mass 2 is
attached to rod 1 at mass 1. The rods are assumed free to swing back and forth
in a plane. Thus the configuration at time t is described by the angles $\theta_1(t)$ and
$\theta_2(t)$ that the rods make with the vertical. Gravity acts on the masses $m_j$, with a
downward force of $m_j g$.

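We identify the plane mentioned above with the complex plane, with rod 1
attached to the origin and the real axis pointing down. Thus the position of mass
1 is


(4.9.1)  $z_1(t) = \ell_1 e^{i\theta_1(t)},$

and the position of mass 2 is

(4.9.2)  $z_2(t) = z_1(t) + \ell_2 e^{i\theta_2(t)}.$


Their velocities are


(4.9.3)  $z_1' = i\ell_1\theta_1' e^{i\theta_1}, \quad z_2' = i\ell_1\theta_1' e^{i\theta_1} + i\ell_2\theta_2' e^{i\theta_2},$


with square norms

(4.9.4)  $\begin{aligned} |z_1'|^2 &= \ell_1^2(\theta_1')^2, \\ |z_2'|^2 &= \bigl(\ell_1\theta_1' e^{i\theta_1} + \ell_2\theta_2' e^{i\theta_2}\bigr)\bigl(\ell_1\theta_1' e^{-i\theta_1} + \ell_2\theta_2' e^{-i\theta_2}\bigr) \\ &= \ell_1^2(\theta_1')^2 + \ell_2^2(\theta_2')^2 + 2\ell_1\ell_2\theta_1'\theta_2'\cos(\theta_1 - \theta_2). \end{aligned}$

Figure 4.9.1. Double pendulum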


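The potential energy of this system is given by

(4.9.5)  $\begin{aligned} V &= -m_1 g\,\mathrm{Re}\, z_1(t) - m_2 g\,\mathrm{Re}\, z_2(t) \\ &= -m_1 g\ell_1\cos\theta_1 - m_2 g(\ell_1\cos\theta_1 + \ell_2\cos\theta_2), \end{aligned}$


and the kinetic energy by

(4.9.6)  $T = \frac{m_1}{2}|z_1'(t)|^2 + \frac{m_2}{2}|z_2'(t)|^2.$

If we write

(4.9.7)  $\theta = \begin{pmatrix}\theta_1\\ \theta_2\end{pmatrix}, \quad \psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \begin{pmatrix}\theta_1'\\ \theta_2'\end{pmatrix},$


then (4.9.4) gives


(4.9.8)  $T = \frac{1}{2}\psi\cdot G(\theta)\psi,$

with

(4.9.9)  $G(\theta) = \begin{pmatrix} (m_1+m_2)\ell_1^2 & m_2\ell_1\ell_2\cos(\theta_1-\theta_2) \\ m_2\ell_1\ell_2\cos(\theta_1-\theta_2) & m_2\ell_2^2 \end{pmatrix}.$

Thus the Lagrangian L = T − V is given by

(4.9.10)  $L(\theta,\psi) = \frac{1}{2}\psi\cdot G(\theta)\psi - V(\theta),$

with $V(\theta)$ as in (4.9.5), and the equation of motion for the double pendulum is

(4.9.11)  $\frac{d}{dt}L_\psi(\theta,\theta') - L_\theta(\theta,\theta') = 0.$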


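As in (4.7.48), this expands out to the 2 by 2 system


(4.9.12)  $\frac{d}{dt}\sum_j G_{kj}(\theta(t))\theta_j'(t) - \frac{1}{2}\sum_{i,j}\frac{\partial G_{ij}}{\partial\theta_k}(\theta(t))\,\theta_i'(t)\theta_j'(t) = -\frac{\partial V}{\partial\theta_k}(\theta),$

for k = 1, 2. Making explicit use of (4.9.5) and (4.9.9), we have

(4.9.13)  $\begin{aligned} L_{\psi_1}(\theta,\psi) &= (m_1+m_2)\ell_1^2\psi_1 + m_2\ell_1\ell_2\psi_2\cos(\theta_1-\theta_2), \\ L_{\psi_2}(\theta,\psi) &= m_2\ell_2^2\psi_2 + m_2\ell_1\ell_2\psi_1\cos(\theta_1-\theta_2), \end{aligned}$

and

(4.9.14)  $\begin{aligned} L_{\theta_1}(\theta,\psi) &= -m_2\ell_1\ell_2\psi_1\psi_2\sin(\theta_1-\theta_2) - (m_1+m_2)g\ell_1\sin\theta_1, \\ L_{\theta_2}(\theta,\psi) &= m_2\ell_1\ell_2\psi_1\psi_2\sin(\theta_1-\theta_2) - m_2 g\ell_2\sin\theta_2. \end{aligned}$

Thus the explicit version of (4.9.11)–(4.9.12) is the pair of equations


(4.9.15)  $(m_1+m_2)\ell_1^2\theta_1'' + m_2\ell_1\ell_2\frac{d}{dt}\bigl[\theta_2'\cos(\theta_1-\theta_2)\bigr] = -m_2\ell_1\ell_2\theta_1'\theta_2'\sin(\theta_1-\theta_2) - (m_1+m_2)g\ell_1\sin\theta_1,$

and

(4.9.16)  $\ell_2^2\theta_2'' + \ell_1\ell_2\frac{d}{dt}\bigl[\theta_1'\cos(\theta_1-\theta_2)\bigr] = \ell_1\ell_2\theta_1'\theta_2'\sin(\theta_1-\theta_2) - g\ell_2\sin\theta_2.$

Note that the masses $m_1$ and $m_2$ do not appear in (4.9.16); $m_1$ does not appear in
either term of $(d/dt)L_{\psi_2} - L_{\theta_2}$, and $m_2$ factors out.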
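As in (4.7.44)–(4.7.47), we have the energy


(4.9.17)  $E(\theta,\psi) = \frac{1}{2}\psi\cdot G(\theta)\psi + V(\theta),$

and if $\theta(t)$ solves (4.9.11), or equivalently (4.9.15)–(4.9.16), then


(4.9.18)  $\frac{d}{dt}E(\theta(t),\theta'(t)) = 0.$


By (4.9.5) and (4.9.9), the explicit form of the energy is

(4.9.19)  $\begin{aligned} E(\theta,\psi) = {}& \frac{1}{2}(m_1+m_2)\ell_1^2\psi_1^2 + m_2\ell_1\ell_2\psi_1\psi_2\cos(\theta_1-\theta_2) \\ &+ \frac{1}{2}m_2\ell_2^2\psi_2^2 - m_1 g\ell_1\cos\theta_1 - m_2 g(\ell_1\cos\theta_1 + \ell_2\cos\theta_2). \end{aligned}$


As in (4.7.57)–(4.7.60), we can convert the equations of motion to Hamiltonian
form, by setting

(4.9.20)  $p = G(\theta)\psi.$

The energy (4.9.17) becomes

(4.9.21)  $\mathcal{E}(\theta,p) = E(\theta, G(\theta)^{-1}p) = \frac{1}{2}p\cdot G(\theta)^{-1}p + V(\theta),$

and (4.9.11) is equivalent to

(4.9.22)  $\frac{d\theta_k}{dt} = \frac{\partial\mathcal{E}}{\partial p_k}, \quad \frac{dp_k}{dt} = -\frac{\partial\mathcal{E}}{\partial\theta_k}.$

Note that, for $G(\theta)$ given by (4.9.9),


(4.9.23)  $G(\theta)^{-1} = \frac{1}{\det G(\theta)}\begin{pmatrix} m_2\ell_2^2 & -m_2\ell_1\ell_2\cos(\theta_1-\theta_2) \\ -m_2\ell_1\ell_2\cos(\theta_1-\theta_2) & (m_1+m_2)\ell_1^2 \end{pmatrix},$

and

(4.9.24)  $\det G(\theta) = m_2\ell_1^2\ell_2^2\bigl(m_1 + m_2\sin^2(\theta_1-\theta_2)\bigr).$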
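
As a sanity check on the inverse and determinant formulas, one can evaluate them at sample values. The sketch below writes out the matrix with entries $(m_1+m_2)\ell_1^2$, $m_2\ell_1\ell_2\cos(\theta_1-\theta_2)$, $m_2\ell_2^2$ from (4.9.9) and its cofactor inverse, and multiplies the two. The function names and the sample parameters are arbitrary choices made for this illustration.

```python
import math

def mass_matrix(theta1, theta2, m1, m2, l1, l2):
    """G(theta) from (4.9.9), as a 2x2 nested list."""
    c = math.cos(theta1 - theta2)
    return [[(m1 + m2) * l1 ** 2, m2 * l1 * l2 * c],
            [m2 * l1 * l2 * c, m2 * l2 ** 2]]

def det_mass_matrix(theta1, theta2, m1, m2, l1, l2):
    """det G(theta) = m2 l1^2 l2^2 (m1 + m2 sin^2(theta1 - theta2))."""
    s = math.sin(theta1 - theta2)
    return m2 * l1 ** 2 * l2 ** 2 * (m1 + m2 * s ** 2)

def inverse_mass_matrix(theta1, theta2, m1, m2, l1, l2):
    """H(theta) = G(theta)^{-1}, written via cofactors over the determinant."""
    c = math.cos(theta1 - theta2)
    d = det_mass_matrix(theta1, theta2, m1, m2, l1, l2)
    return [[m2 * l2 ** 2 / d, -m2 * l1 * l2 * c / d],
            [-m2 * l1 * l2 * c / d, (m1 + m2) * l1 ** 2 / d]]

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sample angles, masses, and lengths (arbitrary values for the check).
args = (0.7, -0.4, 2.0, 1.5, 1.2, 0.8)
G = mass_matrix(*args)
H = inverse_mass_matrix(*args)
product = matmul2(G, H)  # should be close to the identity matrix
det_from_entries = G[0][0] * G[1][1] - G[0][1] * G[1][0]
```

Since $m_1, m_2 > 0$, the determinant formula shows directly that $G(\theta)$ is invertible for every $\theta$.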


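For notational simplicity we write

(4.9.25)  $\mathcal{E}(\theta,p) = \frac{1}{2}p\cdot H(\theta)p + V(\theta), \quad H(\theta) = G(\theta)^{-1}.$


Solutions to (4.9.22) are orbits of the flow generated by the Hamiltonian vector
field


(4.9.26)  $X_{\mathcal{E}}(\theta,p) = -J\nabla_{\theta,p}\mathcal{E}(\theta,p) = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}\begin{pmatrix}\nabla_\theta\mathcal{E} \\ \nabla_p\mathcal{E}\end{pmatrix} = \begin{pmatrix}\nabla_p\mathcal{E} \\ -\nabla_\theta\mathcal{E}\end{pmatrix}.$

Here $I \in M(2,\mathbb{R})$ is the identity matrix and $J \in M(4,\mathbb{R})$ is defined by the second
identity in (4.9.26). From this formula we see that the critical points of $X_{\mathcal{E}}$ coincide
with the critical points of $\mathcal{E}$. Note that


(4.9.27)  $\nabla_p\mathcal{E}(\theta,p) = H(\theta)p,$

and $H(\theta)$ is invertible for all $\theta$, so if $\mathcal{E}$ has a critical point at $(\theta,p)$, then p = 0. Now

(4.9.28)  $\nabla_\theta\mathcal{E}(\theta,0) = \nabla V(\theta),$


so we deduce that $(\theta,p)$ is a critical point of $X_{\mathcal{E}}$ if and only if p = 0 and $\nabla V(\theta) = 0$.
Rewriting (4.9.5) as


(4.9.29)  $V(\theta) = -(m_1+m_2)g\ell_1\cos\theta_1 - m_2 g\ell_2\cos\theta_2,$


we see that


(4.9.30)  $\nabla V(\theta) = \begin{pmatrix}(m_1+m_2)g\ell_1\sin\theta_1 \\ m_2 g\ell_2\sin\theta_2\end{pmatrix},$


so the critical points of V consist of $\theta_1 = j\pi$, $\theta_2 = k\pi$, $j,k \in \mathbb{Z}$. In summary, the
critical points of $X_{\mathcal{E}}$ consist of


(4.9.31)  $(\theta_1,\theta_2,p_1,p_2) = (j\pi, k\pi, 0, 0), \quad j,k \in \mathbb{Z}.$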


Towards the goal of understanding the behavior of $X_{\mathcal{E}}$ near these critical points,
we examine its derivative. We have


(4.9.32)  $DX_{\mathcal{E}}(\theta,0) = \begin{pmatrix} 0 & H(\theta) \\ -D^2V(\theta) & 0 \end{pmatrix}.$


The matrix $H(\theta)$ is positive definite for all $\theta$, and in particular, since $\sin j\pi = 0$
and $\cos j\pi = (-1)^j$,


(4.9.33)  $H(j\pi,k\pi) = \frac{1}{m_1 m_2\ell_1^2\ell_2^2}\begin{pmatrix} m_2\ell_2^2 & (-1)^{j-k+1}m_2\ell_1\ell_2 \\ (-1)^{j-k+1}m_2\ell_1\ell_2 & (m_1+m_2)\ell_1^2 \end{pmatrix}.$


Also,


(4.9.34)  $D^2V(j\pi,k\pi) = \begin{pmatrix} (-1)^j(m_1+m_2)g\ell_1 & 0 \\ 0 & (-1)^k m_2 g\ell_2 \end{pmatrix}.$

We are set up to examine the linearization of the flow generated by $X_{\mathcal{E}}$ at the
critical points. This will be pursued, in a more general setting, in the next section.


Exercises


1. Pass to the limit $m_2 \to 0$ in the double pendulum system (4.9.15)–(4.9.16) and
derive the limiting system

(4.9.35)  $\begin{aligned} &\theta_1'' + \frac{g}{\ell_1}\sin\theta_1 = 0, \\ &\theta_2'' + \frac{\ell_1}{\ell_2}\frac{d}{dt}\bigl[\theta_1'\cos(\theta_1-\theta_2)\bigr] = \frac{\ell_1}{\ell_2}\theta_1'\theta_2'\sin(\theta_1-\theta_2) - \frac{g}{\ell_2}\sin\theta_2. \end{aligned}$


2. Recall the spherical pendulum, introduced in Exercise 5 of §4.7. Derive equations
of motion for a double spherical pendulum.


3. Instead of assuming all the mass of rods 1 and 2 is concentrated at an end,
assume that rod j has mass distribution $m_j(s)\,ds$, $0 \le s \le \ell_j$, so the total mass
of rod j is $m_j = \int_0^{\ell_j} m_j(s)\,ds$, j = 1, 2. Obtain formulas for the potential and
kinetic energy, replacing (4.9.5) and (4.9.6), and then obtain equations of motion,
replacing (4.9.15)–(4.9.16).

Note. See Exercise 16 in §4.7 to get started.


4.10. Momentum-quadratic Hamiltonian systems 


Most of the Lagrangians arising in the last three sections have been of the form

(4.10.1)  $L(x,v) = \frac{1}{2}v\cdot G(x)v - V(x),$


for $x \in \Omega \subset \mathbb{R}^n$, $v \in \mathbb{R}^n$, where $G(x) \in M(n,\mathbb{R})$ is symmetric and invertible.
In the cases seen so far, G(x) has in fact been positive definite, but for now we will
work in this more general setting. As exercises
in §4.7 have revealed, we can make the change of variables $(x,v) \mapsto (x,p)$ with
p = G(x)v and convert the Lagrange system of differential equations to Hamiltonian
form,

(4.10.2)  $\frac{dx_k}{dt} = \frac{\partial\mathcal{E}}{\partial p_k}, \quad \frac{dp_k}{dt} = -\frac{\partial\mathcal{E}}{\partial x_k},$


where

(4.10.3)  $\mathcal{E}(x,p) = \frac{1}{2}p\cdot H(x)p + V(x), \quad H(x) = G(x)^{-1}.$


We call such systems momentum-quadratic Hamiltonian systems. Note that H(x) is
also symmetric and invertible, and furthermore positive definite if G(x) is. Solutions
of (4.10.2) are orbits of the flow generated by the Hamiltonian vector field

(4.10.4)  $X_{\mathcal{E}}(x,p) = -J\nabla_{x,p}\mathcal{E}(x,p) = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}\begin{pmatrix}\nabla_x\mathcal{E} \\ \nabla_p\mathcal{E}\end{pmatrix} = \begin{pmatrix}\nabla_p\mathcal{E} \\ -\nabla_x\mathcal{E}\end{pmatrix}.$

Here, $I \in M(n,\mathbb{R})$ is the identity matrix, and $J \in M(2n,\mathbb{R})$ is defined by the
second identity in (4.10.4).

We record some general results about the critical points of such fields, and their
linearizations. To begin, the critical points of $X_{\mathcal{E}}$ coincide with the critical points
of $\mathcal{E}$. Note that

(4.10.5)  $\nabla_p\mathcal{E}(x,p) = H(x)p,$


so, since H(x) is invertible, we see that if $\mathcal{E}$ has a critical point at (x, p), then p = 0.
Now


(4.10.6)  $\nabla_x\mathcal{E}(x,0) = \nabla V(x),$

so we deduce that the critical points of $X_{\mathcal{E}}$ consist of

(4.10.7)  $\{(x,0) : \nabla V(x) = 0\}.$


We next look at the linearization (cf. (4.3.39)) of $X_{\mathcal{E}}$ at a critical point $(x_0, 0)$,
given by


(4.10.8)  $DX_{\mathcal{E}}(x_0,0) = \begin{pmatrix} 0 & H(x_0) \\ -D^2V(x_0) & 0 \end{pmatrix}.$


From here on, we assume $H(x_0)$ is positive definite. For notational simplicity, we
set


(4.10.9)  $H = H(x_0), \quad W = D^2V(x_0), \quad L = \begin{pmatrix} 0 & H \\ -W & 0 \end{pmatrix}.$

Then the linearization of (4.10.2) at $(x_0, 0)$ is

(4.10.10)  $\frac{dx}{dt} = Hp, \quad \frac{dp}{dt} = -Wx.$


To analyze the structure of solutions to (4.10.10), it is convenient to directly
tackle the second order system

(4.10.11)  $\frac{d^2x}{dt^2} = -HWx,$


and to do this we bring in the following.


Lemma 4.10.1. Given that $H \in M(n,\mathbb{R})$ is positive definite, there exists a positive
definite $A \in M(n,\mathbb{R})$ such that


(4.10.12)  $H = A^2.$

Proof. From Chapter 2 we know that $\mathbb{R}^n$ has an orthonormal basis $\{v_j\}$ of eigenvectors
of H, so $Hv_j = \lambda_j v_j$, $1 \le j \le n$. Each $\lambda_j$ is positive, so we can define A
by $Av_j = \sqrt{\lambda_j}\,v_j$, $1 \le j \le n$.

If we make the change of variable


(4.10.13)  $x = Ay,$

then (4.10.11) is converted to

(4.10.14)  $y'' + AWAy = 0.$


Note that $W \in M(n,\mathbb{R})$ is symmetric and so is AWA. Also AWA is invertible
if and only if W is. This invertibility is equivalent to the assertion that $(x_0,0)$
is a nondegenerate critical point of $X_{\mathcal{E}}$. We restrict attention to such cases. The
following result will be useful.


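Lemma 4.10.2. Let $W \in M(n,\mathbb{R})$ be a symmetric matrix, and assume

(4.10.15)  W has k positive and n − k negative eigenvalues.

Then so does AWA, when $A \in M(n,\mathbb{R})$ is positive definite.

Proof. Write $\mathbb{R}^n = W_+ \oplus W_-$, where $W_+$ is the linear span of the eigenvectors
of W with positive eigenvalue, $W_-$ the linear span of the eigenvectors of W with
negative eigenvalue. Similarly, write $\mathbb{R}^n = \widetilde{W}_+ \oplus \widetilde{W}_-$, with W replaced by AWA.
The image $A\widetilde{W}_+$ of $\widetilde{W}_+$ under A is a linear subspace of $\mathbb{R}^n$, and


(4.10.16)  $v = Aw \in A\widetilde{W}_+,\ w \ne 0 \implies v\cdot Wv = w\cdot AWAw > 0 \implies v \notin W_-.$

Thus, with $P_+$ denoting the projection of $\mathbb{R}^n$ onto $W_+$ that annihilates $W_-$,

(4.10.17)  $P_+A : \widetilde{W}_+ \to W_+$, injectively,

so

(4.10.18)  $\dim\widetilde{W}_+ \le \dim W_+.$

A similar argument gives

(4.10.19)  $\dim\widetilde{W}_- \le \dim W_-,$


and, since both pairs of dimensions add up to n, finishes the proof.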
To continue, under the hypotheses of Lemma 4.10.2, we have an orthonormal
basis $\{u_1,\dots,u_n\}$ of $\mathbb{R}^n$ such that, with $\mu_j \in (0,\infty)$,

(4.10.20)  $\begin{aligned} AWAu_j &= \mu_j^2 u_j, & j &\le k, \\ AWAu_j &= -\mu_j^2 u_j, & j &> k. \end{aligned}$


In such a case, the general solution to (4.10.14) is

(4.10.21)  $y(t) = \sum_{j\le k}(a_j\sin\mu_j t + b_j\cos\mu_j t)u_j + \sum_{j>k}(a_j e^{\mu_j t} + b_j e^{-\mu_j t})u_j.$


Such y(t) leads to


(4.10.22)  $e^{tL}\begin{pmatrix}v_0\\ v_1\end{pmatrix} = \begin{pmatrix}Ay(t)\\ A^{-1}y'(t)\end{pmatrix},$


for general $v_0, v_1 \in \mathbb{R}^n$. As a result, we have the following.

Proposition 4.10.3. Under the hypotheses of Lemma 4.10.2, L, given by (4.10.9),
is diagonalizable, and its eigenvalues are


(4.10.23)  $\begin{aligned} &\pm i\mu_j && \text{for } j \le k, \\ &\pm\mu_j && \text{for } j > k. \end{aligned}$


Proof. The eigenvalues of L are what appear in the exponents in the matrix coefficients
of $e^{tL}$. If L were not diagonalizable, some matrix coefficients would also
contain terms of the form $t^\ell e^{t\mu}$, $\ell \ge 1$, where $\mu = \pm i\mu_j$ or $\pm\mu_j$ in (4.10.23),
depending on j.


A critical point of $X_{\mathcal{E}}$ is said to be hyperbolic if all of the eigenvalues of $DX_{\mathcal{E}}$
have nonzero real part. From the analysis above, we have the following.


Proposition 4.10.4. A critical point $(x_0,0)$ of $X_{\mathcal{E}}$ is hyperbolic if and only if

(4.10.24)  $D^2V(x_0)$ is negative definite.


If (4.10.24) holds, $DX_{\mathcal{E}}(x_0,0)$ has n positive eigenvalues and n negative eigenvalues.


Whenever a vector field X (Hamiltonian or not) has a hyperbolic critical point,
say at $z_0$, the phase portrait near $z_0$ for the flow generated by X has a similar
appearance to that for the flow generated by its linearization at $z_0$. This is a generalization
of the two dimensional result mentioned below (4.3.78). See Appendix
4.C for further discussion.


The opposite extreme can also be read off from (4.10.23).


Proposition 4.10.5. At a critical point $(x_0,0)$ of $X_{\mathcal{E}}$, all the eigenvalues of $DX_{\mathcal{E}}$
are purely imaginary if and only if


(4.10.25)  $D^2V(x_0)$ is positive definite.

Recalling that $\mathcal{E}(x,p)$ is given by (4.10.3), we see that (4.10.25) is equivalent
to

(4.10.26)  $D^2\mathcal{E}(x_0,0) \in M(2n,\mathbb{R})$ is positive definite,


in which case $\mathcal{E}$ has a local minimum at $(x_0,0)$.


In case (4.10.25) holds, we can deduce from (4.10.21)–(4.10.22), with k = n,
that the orbits of $e^{tL}$ all lie in n-dimensional tori. As for the flow generated by $X_{\mathcal{E}}$
itself, we know that its orbits all lie on level surfaces of $\mathcal{E}$. Near $(x,p) = (x_0,0)$,
these level sets look like (2n − 1)-dimensional spheres in $\mathbb{R}^{2n}$. In case n = 1, these
are closed curves in $\mathbb{R}^2$, and indeed the phase portrait for the flow generated by
$X_{\mathcal{E}}$ near $(x_0,0)$ looks like that for the flow generated by its linearization. In such
a case, $(x_0,0)$ is a center, discussed in §3. In case n > 1, the orbits of the flow
generated by $X_{\mathcal{E}}$ near $(x_0,0)$ do not necessarily lie on n-dimensional tori. The
analysis of this behavior is much more subtle than in the case of hyperbolic critical
points. There will be n-dimensional tori that are invariant under the flow,
arising rather densely near $(x_0,0)$, but the flow generated by $X_{\mathcal{E}}$ often has chaotic
behavior on the complement of these tori. Study of this situation is part of the
deep Kolmogorov-Arnold-Moser (KAM) theory. Discussion of this, and references
to further work, can be found in [1], Chapter 8, and [5], Appendices 7-8.

For $n \ge 2$, there can be cases intermediate between those covered by Proposition
4.10.4 and those covered by Proposition 4.10.5.


Proposition 4.10.6. If $(x_0,0)$ is a critical point for $X_{\mathcal{E}}$ and

(4.10.27)  $D^2V(x_0)$ has k positive eigenvalues and n − k negative eigenvalues,


then

(4.10.28)  $DX_{\mathcal{E}}(x_0,0)$ has 2k imaginary eigenvalues, and n − k positive, and n − k negative eigenvalues.


In such cases, with $k \ge 1$ and $n \ge 2$, the phase portrait for the flow generated
by $X_{\mathcal{E}}$ near $(x_0, 0)$ will generally differ from that of its linearization in important
details, with some exceptions, arising when $X_{\mathcal{E}}$ is integrable. We refer to the sources
cited above for more on this.

Let us specialize these results to the case of the double pendulum, discussed
in §4.9. There V was given by (4.9.29), and the critical points by (4.9.31), i.e.,
$(j\pi, k\pi, 0, 0)$, and $D^2V(j\pi,k\pi)$ by (4.9.34). We have


(4.10.29)  $\begin{aligned} j \text{ and } k \text{ even} &\implies D^2V(j\pi,k\pi) \text{ positive definite}, \\ j \text{ and } k \text{ odd} &\implies D^2V(j\pi,k\pi) \text{ negative definite}, \\ j \text{ and } k \text{ of opposite parity} &\implies D^2V(j\pi,k\pi) \text{ indefinite}. \end{aligned}$


In the first case Proposition 4.10.5 applies, in the second case Proposition 4.10.4
applies, and in the third case Proposition 4.10.6 applies, with k = 1 and n − k = 1.
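
The three cases can be checked numerically for the double pendulum. The sketch below is an illustration under stated assumptions, not part of the text: it uses the block form of the linearization, with H and $W = D^2V$ at $(j\pi, k\pi)$ as in (4.9.33)–(4.9.34), and the observation that $L^2$ is block diagonal with blocks −HW and −WH, so each eigenvalue λ of L satisfies λ² = −μ for an eigenvalue μ of the 2 × 2 matrix HW. The function names, the default parameter values, and the tolerance 1e-12 are choices made here.

```python
import cmath
import math

def linearization_eigenvalues(j, k, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.81):
    """Eigenvalues of DX_E at the critical point (j*pi, k*pi, 0, 0)."""
    sign = -1.0 if (j - k) % 2 == 0 else 1.0          # (-1)^(j-k+1)
    scale = 1.0 / (m1 * m2 * l1 ** 2 * l2 ** 2)
    h11 = scale * m2 * l2 ** 2
    h12 = scale * sign * m2 * l1 * l2
    h22 = scale * (m1 + m2) * l1 ** 2
    w1 = (-1) ** (j % 2) * (m1 + m2) * g * l1         # diagonal entries of D^2 V
    w2 = (-1) ** (k % 2) * m2 * g * l2
    # Eigenvalues mu of HW from its trace and determinant; they are real,
    # since HW is similar to the symmetric matrix AWA.
    trace = h11 * w1 + h22 * w2
    det = (h11 * h22 - h12 ** 2) * w1 * w2
    root = math.sqrt(max(trace ** 2 - 4.0 * det, 0.0))
    eigenvalues = []
    for mu in (0.5 * (trace + root), 0.5 * (trace - root)):
        lam = cmath.sqrt(-mu)                         # lam^2 = -mu
        eigenvalues += [lam, -lam]
    return eigenvalues

def classify(j, k):
    """Return (# purely imaginary, # positive, # negative) eigenvalues."""
    lams = linearization_eigenvalues(j, k)
    imaginary = sum(1 for lam in lams if abs(lam.real) < 1e-12)
    positive = sum(1 for lam in lams if lam.real > 1e-12)
    negative = sum(1 for lam in lams if lam.real < -1e-12)
    return imaginary, positive, negative
```

Calling `classify(0, 0)`, `classify(1, 1)`, and `classify(0, 1)` gives the eigenvalue counts for the hanging, fully inverted, and mixed configurations, to be compared with Propositions 4.10.5, 4.10.4, and 4.10.6 respectively.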


Exercises


1. Establish analogues of Propositions 4.10.3, 4.10.5, and 4.10.6 in case H is allowed
to be indefinite (nondegenerate), and we assume


(4.10.30)  $D^2V(x_0)$ is either positive definite or negative definite.


Exercises 2-6 deal with the 2 × 2 system

(4.10.31)  $\frac{d^2}{dt^2}\begin{pmatrix}x\\ y\end{pmatrix} = -\nabla_{x,y}V(x,y),$


for various functions V. The associated energy function, as in (4.10.3), is


(4.10.32)  $\mathcal{E}(x,y,p,q) = \frac{1}{2}(p^2 + q^2) + V(x,y).$


In each case, do the following.

a) Find all the critical points of $\mathcal{E}$.

b) Determine the type of each critical point of $\mathcal{E}$.

c) Determine the behavior of the eigenvalues of $DX_{\mathcal{E}}$ at each such critical point
(via Proposition 4.10.6).

2. Take

V(x, y) = (cos x)(cos y). 
3. Take 

Viey=2? +eyty". 
4. Take 

View =2* + ayy. 
5. Take 

V(a,y) = 2*-ay+y* 
6. Take 


V(a,y) = a4 —a?y ty? 


7. Do analogues of Exercises 2-6 with (4.10.32) replaced by


(4.10.33)  $\mathcal{E}(x,y,p,q) = \frac{1}{2}(p^2 - q^2) + V(x,y).$


Now Proposition 4.10.6 will not apply, but Exercise 1 might (or might not).


4.11. Numerical study—difference schemes 


We describe some ways of numerically approximating the solution to a system of
differential equations


(4.11.1)  $\frac{dx}{dt} = F(x), \quad x(t_0) = x_0.$


Higher order systems can be transformed to first order systems and treated by these
methods, which are known as difference schemes.


To start, we pick a time step h and attempt an approximation to the solution
to (4.11.1) at times $t_0 + nh$:


(4.11.2)  $x_n \approx x(t_0 + nh).$

Noting that a smooth solution to (4.11.1) satisfies

(4.11.3)  $\begin{aligned} x(t+h) &= x(t) + hx'(t) + O(h^2) \\ &= x(t) + hF(x(t)) + O(h^2), \end{aligned}$

we have the following crude difference scheme:


(4.11.4)  $x_{n+1} = x_n + hF(x_n).$


This is said to be first order accurate, meaning that over an interval of unit length
one carries out 1/h such operations, each with error $O(h^2)$, giving an accumulated
error O(h), i.e., on the order of h to the first power. This method of approximating
the solution x(t) is often called the Euler method, though considering what a great
master of computation Euler was, it is hard to believe he actually took it seriously.
Shortly we will present a fourth order accurate method, which is generally
satisfactory, after describing some second order accurate methods.
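
To see first order accuracy concretely, here is a minimal sketch of the scheme (4.11.4) for a scalar equation. The test problem x' = x, x(0) = 1 and the helper names are choices made for this illustration. Halving h should roughly halve the error at t = 1.

```python
import math

def euler(F, x0, h, n_steps):
    """Apply x_{n+1} = x_n + h F(x_n), i.e. (4.11.4), n_steps times."""
    x = x0
    for _ in range(n_steps):
        x = x + h * F(x)
    return x

def growth(x):
    """Right side of the test problem x' = x, x(0) = 1, so x(1) = e."""
    return x

def euler_error(n_steps):
    """Error at t = 1 when the unit interval is split into n_steps steps."""
    return abs(euler(growth, 1.0, 1.0 / n_steps, n_steps) - math.e)

# For a first order scheme this ratio should be close to 2.
ratio = euler_error(100) / euler_error(200)
```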


These better difference schemes will be suggested by higher order accurate
methods of numerical integration. The connection between the two comes from
rewriting (4.11.1) as


(4.11.5)  $x(t+h) = x(t) + \int_0^h F(x(t+s))\,ds.$

Consider methods of approximating


(4.11.6)  $\int_0^h g(s)\,ds$


better than $hg(0) + O(h^2)$, for smooth g. Two simple improvements are


(4.11.7)  $\frac{h}{2}\bigl[g(0) + g(h)\bigr] + O(h^3),$

the trapezoidal method, and

(4.11.8)  $hg\Bigl(\frac{h}{2}\Bigr) + O(h^3),$

the midpoint method. These lead respectively to

(4.11.9)  $x(t+h) = x(t) + \frac{h}{2}\bigl[F(x(t)) + F(x(t+h))\bigr] + O(h^3)$

and

(4.11.10)  $x(t+h) = x(t) + hF\Bigl(x\Bigl(t + \frac{h}{2}\Bigr)\Bigr) + O(h^3).$


Neither of them immediately converts to an explicit difference scheme, but in
(4.11.9) we can substitute $F(x(t+h)) = F(x(t) + hF(x(t))) + O(h^2)$ and in
(4.11.10) we can substitute $F(x(t+h/2)) = F(x(t) + (h/2)F(x(t))) + O(h^2)$, to
obtain the second order accurate difference schemes


(4.11.11)  $x_{n+1} = x_n + \frac{h}{2}\bigl[F(x_n) + F(x_n + hF(x_n))\bigr]$


and


(4.11.12)  $x_{n+1} = x_n + hF\Bigl(x_n + \frac{h}{2}F(x_n)\Bigr).$


Often (4.11.11) is called Heun's method and (4.11.12) a modified Euler method.
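
The two second order schemes can be compared on a nonlinear test problem. The following sketch assumes a scalar equation; the test problem x' = −x², x(0) = 1 (with solution 1/(1 + t)) and all function names are choices made here. For a second order scheme, halving h should reduce the error at t = 1 by a factor of about 4.

```python
def heun_step(F, x, h):
    """One step of Heun's method: average the slopes at x and x + h F(x)."""
    k1 = F(x)
    return x + 0.5 * h * (k1 + F(x + h * k1))

def modified_euler_step(F, x, h):
    """One step of the modified Euler method: slope at the predicted midpoint."""
    return x + h * F(x + 0.5 * h * F(x))

def integrate(step, F, x0, h, n_steps):
    """Apply a one-step scheme n_steps times, starting from x0."""
    x = x0
    for _ in range(n_steps):
        x = step(F, x, h)
    return x

def F(x):
    """Right side of the test problem x' = -x^2, x(0) = 1, x(t) = 1/(1 + t)."""
    return -x * x

exact = 0.5  # x(1)
errors = {}
for name, step in (("heun", heun_step), ("modified_euler", modified_euler_step)):
    errors[name] = [abs(integrate(step, F, 1.0, 1.0 / n, n) - exact)
                    for n in (100, 200)]
```

Each entry of `errors` holds the errors for h = 1/100 and h = 1/200; their ratio indicates the observed order of accuracy.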


We now come to the heart of the matter for this section. The Runge-Kutta
scheme for (4.11.1) is specified as follows. The approximation $x_n$ to $x(t_0 + nh)$ is
given recursively by


(4.11.13)  $x_{n+1} = x_n + \frac{h}{6}\bigl(K_{n1} + 2K_{n2} + 2K_{n3} + K_{n4}\bigr),$

where

(4.11.14)  $\begin{aligned} K_{n1} &= F(x_n), \\ K_{n2} &= F\bigl(x_n + \tfrac{1}{2}hK_{n1}\bigr), \\ K_{n3} &= F\bigl(x_n + \tfrac{1}{2}hK_{n2}\bigr), \\ K_{n4} &= F(x_n + hK_{n3}). \end{aligned}$


This scheme is fourth order accurate. It is one of the most popular and important
difference schemes used for numerical studies of systems of differential equations.
Before proving that it works, we make some comments about its derivation.
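
The scheme (4.11.13)–(4.11.14) takes only a few lines to implement. The sketch below represents the state of a system as a Python list of floats; that representation, the function names, and the pendulum example with its energy are assumptions made for this illustration.

```python
import math

def rk4_step(F, x, h):
    """One Runge-Kutta step; x and F(x) are lists of floats of equal length."""
    k1 = F(x)
    k2 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = F([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def rk4_solve(F, x0, h, n_steps):
    """Apply rk4_step n_steps times, starting from x0."""
    x = list(x0)
    for _ in range(n_steps):
        x = rk4_step(F, x, h)
    return x

def pendulum(state):
    """Pendulum system theta' = psi, psi' = -sin(theta)."""
    theta, psi = state
    return [psi, -math.sin(theta)]

def pendulum_energy(state):
    """Conserved energy psi^2/2 - cos(theta) of the pendulum system."""
    theta, psi = state
    return 0.5 * psi ** 2 - math.cos(theta)
```

A quick check of the implementation is to verify that `pendulum_energy` stays nearly constant along the computed orbit, and that halving h reduces the error on a problem with known solution by a factor of about 16.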


We will consider a method of deriving fourth order accurate difference schemes,
based on Simpson's formula


(4.11.15)  $\int_0^h g(s)\,ds = \frac{h}{6}\Bigl[g(0) + 4g\Bigl(\frac{h}{2}\Bigr) + g(h)\Bigr] + O(h^5).$

This formula is derived by producing a quadratic polynomial p(s) such that p(s) =
g(s) at s = 0, h/2, and h, and then exactly integrating p(s). The formula can be
verified by rewriting it as


(4.11.16)  $\int_{-h}^{h} G(s)\,ds = \frac{h}{3}\bigl[G(-h) + 4G(0) + G(h)\bigr] + O(h^5).$

The main part on the right is exact for all odd G(s), and it is also exact for G(s) = 1
and $G(s) = s^2$, so it is exact when G(s) is a polynomial of degree ≤ 3. Making a
power series expansion $G(s) = \sum_{j=0}^{3} a_j s^j + O(s^4)$ then yields (4.11.16).
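
Both claims about Simpson's formula (exactness on cubics and an $O(h^5)$ error in general) are easy to test numerically. The sketch below is illustrative only; the function names and the choice of test integrands are made here.

```python
import math

def simpson(g, h):
    """Main part of the right side of (4.11.15) on the interval [0, h]."""
    return (h / 6.0) * (g(0.0) + 4.0 * g(0.5 * h) + g(h))

def cubic(s):
    """A sample cubic; its integral over [0, h] is h^4/4 - h^2 + h."""
    return s ** 3 - 2.0 * s + 1.0

def simpson_error_exp(h):
    """Error of Simpson's formula for g = exp, whose integral is e^h - 1."""
    return abs(simpson(math.exp, h) - math.expm1(h))

# Halving h should shrink the error by a factor of roughly 2^5 = 32.
ratio = simpson_error_exp(0.4) / simpson_error_exp(0.2)
```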
Now, write the equation (4.11.1) as the integral equation (4.11.5). By (4.11.15),


(4.11.17)  $\int_0^h F(x(t+s))\,ds = \frac{h}{6}\Bigl[F(x(t)) + 4F\Bigl(x\Bigl(t+\frac{h}{2}\Bigr)\Bigr) + F(x(t+h))\Bigr] + O(h^5).$


We then have as an immediate consequence the following result on producing accurate
difference schemes.

Proposition 4.11.1. Suppose the approximation

(4.11.18)  $x(t+h) \approx x(t) + \Phi(x(t),h) = \mathcal{X}(x(t),h)$


produces a jth order accurate difference scheme for the solution to (4.11.1). If
$j \le 3$, then a difference scheme accurate of order j + 1 is given by


(4.11.19)  $x_{n+1} = x_n + \frac{h}{6}\Bigl[F(x_n) + 4F\Bigl(\mathcal{X}\Bigl(x_n,\frac{h}{2}\Bigr)\Bigr) + F(\mathcal{X}(x_n,h))\Bigr].$


Furthermore, if $x(t+h) \approx \mathcal{X}_\ell(x(t),h)$ both work in (4.11.18), $\ell = 0, 1$, then you can
use


(4.11.20)  $x_{n+1} = x_n + \frac{h}{6}\Bigl[F(x_n) + 4F\Bigl(\mathcal{X}_0\Bigl(x_n,\frac{h}{2}\Bigr)\Bigr) + F(\mathcal{X}_1(x_n,h))\Bigr].$

We apply this to two second order methods derived before:


(4.11.21)  $\mathcal{X}_0(x_n,h) = x_n + \frac{h}{2}\bigl[F(x_n) + F(x_n + hF(x_n))\bigr]$, Heun,

and


(4.11.22)  $\mathcal{X}_1(x_n,h) = x_n + hF\Bigl(x_n + \frac{h}{2}F(x_n)\Bigr)$, modified Euler.


Thus a third order accurate scheme is produced. The last term in (4.11.20) becomes

(4.11.23)  $\frac{h}{6}\Bigl[F(x_n) + 4F\Bigl(x_n + \frac{h}{4}\Bigl[F(x_n) + F\Bigl(x_n + \frac{h}{2}F\Bigr)\Bigr]\Bigr) + F\Bigl(x_n + hF\Bigl(x_n + \frac{h}{2}F\Bigr)\Bigr)\Bigr],$


where $F = F(x_n)$. In terms of $K_{n1}$, $K_{n2}$ as defined in (4.11.14), we have


(4.11.24)  $\frac{h}{6}\Bigl[K_{n1} + 4F\Bigl(x_n + \frac{h}{4}[K_{n1} + K_{n2}]\Bigr) + F(x_n + hK_{n2})\Bigr].$


This could be used in a third order accurate scheme, but some simplification of the
middle term is desirable. Note that, for smooth H,


(4.11.25)  $H\Bigl(x + \frac{\eta}{2}\Bigr) = \frac{1}{2}H(x) + \frac{1}{2}H(x+\eta) + O(|\eta|^2).$


Consequently, as $|K_{n1} - K_{n2}| = O(h)$, by (4.11.14),


(4.11.26)  $F\Bigl(x_n + \frac{h}{4}[K_{n1} + K_{n2}]\Bigr) = \frac{1}{2}F\Bigl(x_n + \frac{h}{2}K_{n1}\Bigr) + \frac{1}{2}F\Bigl(x_n + \frac{h}{2}K_{n2}\Bigr) + O(h^4).$


Therefore we have the following.


Proposition 4.11.2. A third order accurate difference scheme for (4.11.1) is given
by


(4.11.27)  $x_{n+1} = x_n + \frac{h}{6}\bigl[K_{n1} + 2K_{n2} + 2K_{n3} + L_{n4}\bigr],$

where $K_{n1}$, $K_{n2}$, $K_{n3}$ are given by (4.11.14) and

(4.11.28)  $L_{n4} = F(x_n + hK_{n2}).$

We can now produce a fourth order accurate difference scheme by applying
Proposition 4.11.1 with $\mathcal{X}(x,h)$ defined by (4.11.27). Thus we obtain the difference
scheme


(4.11.29)  $\begin{aligned} x_{n+1} = x_n + \frac{h}{6}\Bigl\{K_{n1} &+ 4F\Bigl(x_n + \frac{h}{12}\bigl[K_{n1} + 2k_{n2} + 2k_{n3} + \ell_{n4}\bigr]\Bigr) \\ &+ F\Bigl(x_n + \frac{h}{6}\bigl[K_{n1} + 2K_{n2} + 2K_{n3} + L_{n4}\bigr]\Bigr)\Bigr\}, \end{aligned}$

where $K_{nj}$, $L_{n4}$ are as above and

(4.11.30)  $\begin{aligned} k_{n2} &= F\Bigl(x_n + \frac{h}{4}K_{n1}\Bigr), \\ k_{n3} &= F\Bigl(x_n + \frac{h}{4}k_{n2}\Bigr), \\ \ell_{n4} &= F\Bigl(x_n + \frac{h}{2}k_{n2}\Bigr). \end{aligned}$


This formula is more complicated than the Runge-Kutta formula (4.11.13). It
requires 9 evaluations of F rather than just 4. Rather than try to modify (4.11.29)
mod $O(h^5)$ to get (4.11.13), we will instead look at how (4.11.27) fails to be fourth
order accurate, by expanding (4.11.27) in powers of h, through $h^4$. To begin this
power series expansion, note that


(4.11.31)  $\begin{aligned} K_{n2} &= F + \frac{h}{2}F'K_{n1} + \frac{h^2}{8}F''K_{n1}^2 + \frac{h^3}{48}F'''K_{n1}^3 + O(h^4), \\ K_{n3} &= F + \frac{h}{2}F'K_{n2} + \frac{h^2}{8}F''K_{n2}^2 + \frac{h^3}{48}F'''K_{n2}^3 + O(h^4), \end{aligned}$

and

(4.11.32)  $L_{n4} = F + hF'K_{n2} + \frac{h^2}{2}F''K_{n2}^2 + \frac{h^3}{6}F'''K_{n2}^3 + O(h^4).$


To simplify notation, we have set $F = F(x_n)$, $F' = F'(x_n)$, $F'' = F''(x_n)$, etc. The
meaning of, e.g., $F'''K^3$ is that of $F''' = D^3F(x_n)$, as a symmetric trilinear form (with
values in $\mathbb{R}^d$ if (4.11.1) is a d × d system) acting on the triple of vectors K, K, K,
as arises in multivariable power series; see [49], Section 2.1. Note that $K_{n1} = F$.
A straightforward substitution yields


(4.11.33)  $\begin{aligned} \frac{h}{6}\bigl[K_{n1} + 2K_{n2} + 2K_{n3} + L_{n4}\bigr] = {}& hF + \frac{h^2}{6}F'\bigl[K_{n1} + 2K_{n2}\bigr] \\ &+ \frac{h^3}{6}F''\Bigl[\frac{1}{4}K_{n1}^2 + \frac{3}{4}K_{n2}^2\Bigr] \\ &+ \frac{h^4}{6}F'''\Bigl[\frac{1}{24}K_{n1}^3 + \frac{5}{24}K_{n2}^3\Bigr] + O(h^5). \end{aligned}$

To proceed further, note from (4.11.31) that

(4.11.34)  $K_{n2} = K_{n1} + \frac{h}{2}F'K_{n1} + \frac{h^2}{8}F''K_{n1}^2 + O(h^3).$


Further substitution into (4.11.33) then shows that (4.11.33) is equal to


(4.11.35)  $\begin{aligned} hF &+ \frac{h^2}{2}F'F + \frac{h^3}{6}\bigl[F''F^2 + (F')^2F\bigr] \\ &+ \frac{h^4}{24}\bigl[F'''F^3 + 3F''F'F\cdot F + F'F''F^2\bigr] + O(h^5). \end{aligned}$

We want to match

(4.11.36)  $hx'(t) + \frac{h^2}{2}x''(t) + \frac{h^3}{6}x^{(3)}(t) + \frac{h^4}{24}x^{(4)}(t) + O(h^5).$


To start, the coefficients of h match, by the differential equation (4.11.1). Further
differentiation of (4.11.1) yields


(4.11.37)  $\begin{aligned} x'' &= F'x' = F'F, \\ x^{(3)} &= F''x'^2 + F'x'' = F''F^2 + (F')^2F, \\ x^{(4)} &= F'''x'^3 + 3F''x''x' + F'x^{(3)} \\ &= F'''F^3 + 3F''F'F\cdot F + (F')^3F + F'F''F^2. \end{aligned}$

Now a comparison shows that (4.11.35) and (4.11.36) match up mod $O(h^4)$, but

(4.11.38)  $(4.11.36) = (4.11.35) + \frac{h^4}{24}(F')^3F + O(h^5).$


Thus we will get a fourth order accurate scheme if we can modify $L_{n4}$ by adding
$(h^3/4)(F')^3F + O(h^4)$. Recalling that $L_{n4} = F(x_n + hF(x_n + (h/2)F))$, note that


(4.11.39)  $F\Bigl(x_n + hF\Bigl(x_n + \frac{h}{2}F + h^2A\Bigr)\Bigr) = L_{n4} + h^3(F')^2A + O(h^4),$

so we pick $A = (1/4)F'F + O(h)$, so $h^2A = (h^2/4)F'F + O(h^3)$. Now

(4.11.40)  $\frac{h}{2}F + \frac{h^2}{4}F'F = \frac{h}{2}K_{n2} + O(h^3),$

by (4.11.34), so

(4.11.41)  $F\Bigl(x_n + hF\Bigl(x_n + \frac{h}{2}K_{n2}\Bigr)\Bigr) = L_{n4} + \frac{h^3}{4}(F')^3F + O(h^4).$


Now the left side of (4.11.41) is precisely $F(x_n + hK_{n3}) = K_{n4}$. Thus we have
produced the Runge-Kutta scheme (4.11.13) and shown that it is fourth order
accurate.

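We have dealt specifically with autonomous systems in (4.11.1), but a nonautonomous
system


(4.11.42)  $\frac{dx}{dt} = G(t,x), \quad x(t_0) = x_0,$


can be treated similarly, as one can see by writing its autonomous analogue


(4.11.43)  $\frac{d}{dt}\begin{pmatrix}x\\ t\end{pmatrix} = \begin{pmatrix}G(t,x)\\ 1\end{pmatrix}, \quad \begin{pmatrix}x(t_0)\\ t(t_0)\end{pmatrix} = \begin{pmatrix}x_0\\ t_0\end{pmatrix},$

and applying the formulas derived above to (4.11.43).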
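
The passage from (4.11.42) to (4.11.43) is mechanical, as the following sketch suggests. It appends t to the state vector and reuses a Runge-Kutta step written for autonomous systems. The helper name `autonomize`, the list-based state, and the convention that G takes (t, x) with x a list are assumptions made for this illustration.

```python
import math

def rk4_step(F, x, h):
    """One Runge-Kutta step (4.11.13)-(4.11.14) for an autonomous system."""
    k1 = F(x)
    k2 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = F([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def autonomize(G):
    """Turn x' = G(t, x) into an autonomous field on the state [x..., t]."""
    def F(state):
        *x, t = state                    # last component carries the time
        return list(G(t, x)) + [1.0]     # and satisfies dt/dt = 1
    return F

def solve_nonautonomous(G, x0, t0, h, n_steps):
    """Integrate x' = G(t, x), x(t0) = x0; returns the final [x..., t]."""
    F = autonomize(G)
    state = list(x0) + [t0]
    for _ in range(n_steps):
        state = rk4_step(F, state, h)
    return state
```

For example, `solve_nonautonomous(lambda t, x: [-2.0 * t * x[0]], [1.0], 0.0, 0.01, 100)` approximates the solution $e^{-t^2}$ of x' = −2tx at t = 1.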


We move briefly to another class of difference schemes, based on power series.
It derives from the expansion


(4.11.44)  $x(t+h) = x(t) + hx'(t) + \frac{h^2}{2}x''(t) + \cdots + \frac{h^k}{k!}x^{(k)}(t) + O(h^{k+1}).$

To begin, differentiate (4.11.1), producing

(4.11.45)  $x''(t) = F_2(x,x'), \quad F_2(x,x') = DF(x)x'.$

Continue differentiating, getting

(4.11.46)  $x^{(j)}(t) = F_j(x,x',\dots,x^{(j-1)}), \quad j \le k.$


Then one obtains a difference scheme for an approximation $x_n$ to $x(t_0 + nh)$, of the
form


(4.11.47)  $x_{n+1} = x_n + hx_n' + \frac{h^2}{2}x_n'' + \cdots + \frac{h^k}{k!}x_n^{(k)},$

where

(4.11.48)  $x_n' = F(x_n), \quad x_n'' = F_2(x_n,x_n'),$


and, inductively,


(4.11.49)  $x_n^{(j)} = F_j(x_n,x_n',\dots,x_n^{(j-1)}).$

This difference scheme is kth order accurate. In practice, this is not usually a
good method, because the formulas for $F_j$ tend to become rapidly more complex.
However, in some cases the functions $F_j$ happen not to become very complex, and
then this is a good method.


To mention a couple of examples, first consider the central force problem


(4.11.50)  $\begin{aligned} x' &= v, \\ y' &= w, \\ v' &= -x(x^2+y^2)^{-3/2}, \\ w' &= -y(x^2+y^2)^{-3/2}. \end{aligned}$


Here, the power series method is not nearly as convenient as the Runge-Kutta
method. On the other hand, for the pendulum problem, which for $g/\ell = 1$ we can
write as


(4.11.51)  $\theta' = \psi, \quad \psi' = -\sin\theta,$


we have


(4.11.52)  $\begin{aligned} \theta'' &= \psi', & \psi'' &= -\psi\cos\theta, \\ \theta^{(3)} &= \psi'', & \psi^{(3)} &= -\psi'\cos\theta + \psi^2\sin\theta, \\ \theta^{(4)} &= \psi^{(3)}, & \psi^{(4)} &= -\psi''\cos\theta + 3\psi\psi'\sin\theta + \psi^3\cos\theta, \end{aligned}$


from which one can get a workable fourth order difference scheme of the form
presented in (4.11.47)–(4.11.49).
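
For the pendulum the power series scheme is short enough to write out in full. The sketch below takes k = 4 in (4.11.47) and obtains the derivatives by differentiating θ' = ψ, ψ' = −sin θ repeatedly: ψ'' = −ψ cos θ, ψ''' = −ψ' cos θ + ψ² sin θ, ψ'''' = −ψ'' cos θ + 3ψψ' sin θ + ψ³ cos θ, with each θ-derivative equal to the ψ-derivative of one order lower. The function names are choices made for this illustration.

```python
import math

def taylor4_step(theta, psi, h):
    """One step of the fourth order power series scheme for the pendulum."""
    s, c = math.sin(theta), math.cos(theta)
    psi1 = -s                                         # psi'
    psi2 = -psi * c                                   # psi''
    psi3 = -psi1 * c + psi ** 2 * s                   # psi'''
    psi4 = -psi2 * c + 3.0 * psi * psi1 * s + psi ** 3 * c
    # theta^(j) = psi^(j-1), so the theta expansion reuses psi, psi1, psi2, psi3.
    theta_new = (theta + h * psi + h ** 2 / 2.0 * psi1
                 + h ** 3 / 6.0 * psi2 + h ** 4 / 24.0 * psi3)
    psi_new = (psi + h * psi1 + h ** 2 / 2.0 * psi2
               + h ** 3 / 6.0 * psi3 + h ** 4 / 24.0 * psi4)
    return theta_new, psi_new

def taylor4_solve(theta, psi, h, n_steps):
    """Apply taylor4_step n_steps times."""
    for _ in range(n_steps):
        theta, psi = taylor4_step(theta, psi, h)
    return theta, psi

def energy(theta, psi):
    """Conserved energy psi^2/2 - cos(theta) of the pendulum."""
    return 0.5 * psi ** 2 - math.cos(theta)
```

As with the Runge-Kutta scheme, near-conservation of `energy` along the computed orbit gives a quick plausibility check.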


There are other classes of difference schemes, such as predictor-corrector methods,
which we will not discuss here. More about this can be found in numerical
analysis texts, such as [7] and [39].

Readers with a working knowledge of a general purpose computer programming
language, such as FORTRAN or C, will find it interesting to implement the
Runge-Kutta method on a variety of systems of differential equations, including
(4.11.50) and (4.11.51). Be sure to use double precision arithmetic, which carries
about 16 significant decimal digits. Alternatively, specialized programming tools
such as MATLAB and Mathematica can be used. These tools have built-in graphics
capability, with which one can produce phase portraits, and they also have built-in
differential equation solvers, whose output one can compare with the output
from one's own program. Useful literature on these latter tools for the study of
differential equations can be found in [37] and [13].


When running such programs, pay attention to the way solutions behave when
the step size h is changed. As a rule of thumb, if the solution does not change
appreciably when the step size is halved, the solution is probably accurate to about
the size of that change. To be sure, there
is frequently more to obtaining accurate solutions than just choosing a small step
size. In many cases, various stability issues arise. One such case is discussed in
Appendix 4.H, concerning an instability for geodesic equations, arising in §4.7, and
how to handle it.
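
The step-halving rule of thumb can be automated. The sketch below runs the Runge-Kutta scheme on the central force problem (4.11.50) with n and 2n steps and reports the largest componentwise change. The circular-orbit initial data (1, 0, 0, 1), whose exact solution is (cos t, sin t, −sin t, cos t), and the function names are choices made for this illustration.

```python
import math

def kepler(state):
    """Right side of the central force system (4.11.50)."""
    x, y, v, w = state
    r3 = (x * x + y * y) ** 1.5
    return [v, w, -x / r3, -y / r3]

def rk4_solve(F, x0, t_end, n_steps):
    """Runge-Kutta scheme (4.11.13)-(4.11.14) with step h = t_end / n_steps."""
    h = t_end / n_steps
    x = list(x0)
    for _ in range(n_steps):
        k1 = F(x)
        k2 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = F([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = F([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

def halving_check(F, x0, t_end, n_steps):
    """Return the finer solution and its largest change from the coarser one."""
    coarse = rk4_solve(F, x0, t_end, n_steps)
    fine = rk4_solve(F, x0, t_end, 2 * n_steps)
    change = max(abs(a - b) for a, b in zip(coarse, fine))
    return fine, change

# One full period of the circular orbit of radius 1.
fine, change = halving_check(kepler, [1.0, 0.0, 0.0, 1.0], 2.0 * math.pi, 1000)
```

If `change` is larger than the accuracy one wants, the usual remedy is to double `n_steps` again and repeat the comparison.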

For more on such matters, we recommend numerical analysis texts, such as
cited above, and of course we also recommend lots of practice on various systems
of differential equations.


Exercises


The following exercises are for readers who can use a programming language.


1. Write a program to apply the Runge-Kutta method to the pendulum equation,
given in (4.11.51).


2. Write a program to apply the power series method described in (4.11.44)–(4.11.49)
to (4.11.51). Produce a fourth order accurate method.


3. Consider applying the Runge-Kutta scheme to the problem of motion in a planar
force field,


(4.11.53)  $x'' = f(x,y), \quad y'' = g(x,y),$

which can be written as the first order system

(4.11.54)  $\begin{aligned} x' &= v, & v' &= f(x,y), \\ y' &= w, & w' &= g(x,y). \end{aligned}$

Show that (4.11.13)–(4.11.14) in this context become


(4.11.55)  $\begin{aligned} x &\mapsto x + \frac{h}{6}(v + 2v_2 + 2v_3 + v_4), \\ y &\mapsto y + \frac{h}{6}(w + 2w_2 + 2w_3 + w_4), \\ v &\mapsto v + \frac{h}{6}(a_1 + 2a_2 + 2a_3 + a_4), \\ w &\mapsto w + \frac{h}{6}(b_1 + 2b_2 + 2b_3 + b_4), \end{aligned}$

where $a_j$, $b_j$, $v_j$, and $w_j$ are computed as follows. First,

(4.11.56)  $a_1 = f(x,y), \quad b_1 = g(x,y);$

then

(4.11.57)  $\begin{aligned} x_2 &= x + \frac{h}{2}v, & y_2 &= y + \frac{h}{2}w, \\ v_2 &= v + \frac{h}{2}a_1, & w_2 &= w + \frac{h}{2}b_1, \end{aligned}$

and

(4.11.58)  $a_2 = f(x_2,y_2), \quad b_2 = g(x_2,y_2);$

then

(4.11.59)  $\begin{aligned} x_3 &= x + \frac{h}{2}v_2, & y_3 &= y + \frac{h}{2}w_2, \\ v_3 &= v + \frac{h}{2}a_2, & w_3 &= w + \frac{h}{2}b_2, \end{aligned}$

and

(4.11.60)  $a_3 = f(x_3,y_3), \quad b_3 = g(x_3,y_3);$

then

(4.11.61)  $\begin{aligned} x_4 &= x + hv_3, & y_4 &= y + hw_3, \\ v_4 &= v + ha_3, & w_4 &= w + hb_3, \end{aligned}$

and finally,

(4.11.62)  $a_4 = f(x_4,y_4), \quad b_4 = g(x_4,y_4).$

Write a program to implement this difference scheme. Test it for various functions
f(x,y) and g(x,y). Consider particularly

(4.11.63)  $f(x,y) = -\frac{x}{(x^2+y^2)^{3/2}}, \quad g(x,y) = -\frac{y}{(x^2+y^2)^{3/2}},$


arising in the Kepler problem, (4.11.50).


4. Extend the scope of Exercise 3 to treat

$x'' = f(x,y,x',y'), \quad y'' = g(x,y,x',y').$


5. Write a program to apply the Runge-Kutta method to the double pendulum 
problem (4.9.15)—(4.9.16). 


6. Use a power series method to produce a sixth order accurate difference scheme 
for the Airy equation, 


7. Peek ahead at the Volterra-Lotka system

$\begin{aligned} x' &= -ax + \sigma xy, \\ y' &= ry - \kappa xy. \end{aligned}$

Here, $a$, $\sigma$, $r$, and $\kappa$ are all positive constants.
(a) Write a program to apply the Runge-Kutta method to this system. 


(b) Use a power series method to produce a fourth order accurate difference 
scheme for this system. 


4.12. Limit sets and periodic orbits 


Let $F$ be a $C^1$ vector field on an open set $\mathcal{O} \subset \mathbb{R}^n$, generating the flow $\Phi^t$. Take
$x \in \mathcal{O}$. If $\Phi^t(x)$ is well defined for all $t \ge 0$, we define the $\omega$-limit set $L_\omega(x)$ to
consist of all points

(4.12.1)  $y \in \mathcal{O}$ such that there exist $t_k \nearrow +\infty$ with $\Phi^{t_k}(x) \to y$.

Similarly, if $\Phi^t(x)$ is well defined for all $t \le 0$, we define the $\alpha$-limit set $L_\alpha(x)$ to
consist of all points $y \in \mathcal{O}$ such that there exist $t_k \searrow -\infty$ with $\Phi^{t_k}(x) \to y$. Sinks
are $\omega$-limit sets for all nearby points. Other examples of $\omega$-limit sets are pictured
in Figures 4.12.1 and 4.12.2. In Figure 4.12.1, $L_\omega(x)$ is a periodic orbit, i.e., for
some $T \in (0,\infty)$, $\Phi^T(y) = y$. In Figure 4.12.2, $L_\omega(x)$ is a figure eight, containing
a hyperbolic critical point of the vector field. The reader might think up examples
in which $L_\omega(x)$ contains several hyperbolic critical points.

Figure 4.12.1. Limit cycle

Figure 4.12.2. Another limit set


The next proposition records some general observations about limit sets that
hold in any dimension. Take $\mathcal{O} \subset \mathbb{R}^n$, $F$, and $\Phi^t$ as above.


Proposition 4.12.1. Assume that $K \subset \mathcal{O}$ is a closed, bounded (hence compact)
set in $\mathbb{R}^n$ and that $\Phi^t(K) \subset K$ for each $t \ge 0$. Take $x \in K$. Then

(4.12.2)  $L_\omega(x)$ is a nonempty, compact subset of $K$,

given by

(4.12.3)  $L_\omega(x) = \bigcap_{s \in \mathbb{R}^+} \overline{\{\Phi^t(x) : t \ge s\}} = \bigcap_{k \in \mathbb{N}} \overline{\{\Phi^t(x) : t \ge k\}}.$

We have

(4.12.4)  $\Phi^t(L_\omega(x)) = L_\omega(x), \quad \forall\, t \ge 0,$

and hence

(4.12.5)  $\Phi^t : L_\omega(x) \to L_\omega(x), \quad \forall\, t \in \mathbb{R}.$

Furthermore,

(4.12.6)  $y \in L_\omega(x) \Longrightarrow L_\omega(y) \subset L_\omega(x).$


Proof. The result (4.12.3) is a straightforward consequence of the definition of
$L_\omega(x)$. The fact that this set is nonempty follows from Proposition 4.B.6, in
Appendix 4.B. The results (4.12.4)-(4.12.6) are left as exercises. $\square$


We now specialize to planar vector fields, where w-limit sets tend to have rather 
special properties. The following result, characterizing w-limit sets without critical 
points in planar regions (under a few additional hypotheses), is called the Poincaré- 
Bendixson theorem. 


Theorem 4.12.2. Let $\mathcal{O}$ be a planar domain, and let $F$ generate a flow $\Phi^t$ on $\mathcal{O}$.
Assume there is a compact set $K \subset \mathcal{O}$ that satisfies $\Phi^t(K) \subset K$ for all $t \ge 0$. Take
$x \in K$. If $L_\omega(x)$ contains no critical point of $F$, then it is a periodic orbit of $\Phi$.


An important ingredient in the proof of the Poincaré-Bendixson theorem is the 
following classical result about closed curves in the plane. 


Jordan curve theorem. Let $C$ be a simple closed curve in $\mathbb{R}^2$, i.e., a continuous,
one-to-one image of the unit circle. Then $\mathbb{R}^2 \setminus C$ consists of two connected pieces.
Any curve from a point in one of these pieces to a point in the other must cross $C$.


We will not present a proof of the Jordan curve theorem. Proofs can be found
in [14], §18, and in [32]. We do mention that actually we will need this result
only for piecewise smooth simple closed curves, where a simpler proof exists; see
[42], pp. 34-40, [45], Chapter 1, §19, or [49], §5.3. The ability of a simple closed
curve to separate $\mathbb{R}^n$ fails for $n \ge 3$, which makes the Poincaré-Bendixson theorem
an essentially two-dimensional result. Examples discussed in §4.15 illustrate how
much more complex matters can be in higher dimension.


To tackle Theorem 4.12.2, first note that the hypotheses imply $L_\omega(x)$ is a
nonempty subset of $K$. Let $y \in L_\omega(x)$, and say

(4.12.7)  $y_k = \Phi^{t_k}(x), \quad t_k \nearrow +\infty, \quad y_k \to y.$

We have $F(y) \ne 0$. Let $\Gamma$ be a smooth curve segment in $\mathcal{O}$, containing $y$, such that
the tangent to $\Gamma$ at $y$ is linearly independent of $F(y)$. Shrinking $\Gamma$ if necessary, we
can assume that for each $z \in \Gamma$, the tangent to $\Gamma$ at $z$ is linearly independent of
$F(z)$. We say $F$ is transverse to $\Gamma$; cf. Figure 4.12.3.

Figure 4.12.3. Curve transverse to orbits of $\Phi^t$


With $y_k$ as in (4.12.7), we can assume all $y_k$ are sufficiently close to $y$ to lie in orbits
through $\Gamma$, and adjusting each $t_k$ as needed, we can take

(4.12.8)  $y_k \in \Gamma, \quad \forall\, k.$

At this point, it is useful to revise the list $\{t_k\}$ slightly. Let $t_1 \in \mathbb{R}^+$, $y_1 = \Phi^{t_1}(x)$ be
as above. Now let $t_k \nearrow +\infty$ denote all the successive times when $\Phi^t(x)$ intersects
$\Gamma$, so we may be adding times to the set denoted $t_k$ in (4.12.7). Shortly we will
show that (4.12.7) continues to hold for this expanded set of points $y_k = \Phi^{t_k}(x)$.
First, we make the following useful observation.


Lemma 4.12.3. With $t_j < t_{j+1} < t_{j+2}$ as above,

(4.12.9)  $y_{j+1}$ lies between $y_j$ and $y_{j+2}$ on $\Gamma$.


Proof. Consider the curve $C_j$ starting at $y_j$, running to $y_{j+1}$ along $\Phi^t(x)$, $t_j \le t \le t_{j+1}$,
and returning to $y_j$ along $\Gamma$. See Figure 4.12.4. This is a simple closed
curve, and the Jordan curve theorem applies.

Now for $s$ and $\sigma$ small and positive, and $z \in \Gamma$, not on the opposite side of $y_{j+1}$
from $y_j$, we have $\Phi^s(y_{j+1}) = \Phi^{t_{j+1}+s}(x)$ and $\Phi^{-\sigma}(z)$ in the two different connected
components of $\mathbb{R}^2 \setminus C_j$. Since $\{\Phi^s(y_{j+1}) : s > 0\}$ cannot cross $C_j$ at any point but
a point in $\Gamma$, we must have

    $\Phi^{-\sigma}(y_{j+2}) = \Phi^{t_{j+2}-\sigma}(x)$

in the opposite component of $\mathbb{R}^2 \setminus C_j$ from that containing such $\Phi^{-\sigma}(z)$, so $y_{j+2}$
must be on the opposite side of $y_{j+1}$ from $y_j$ in $\Gamma$.

Figure 4.12.4. Orbit of $\Phi^t$


Having Lemma 4.12.3, we see that the expanded set of points $\{y_k\} \subset \Gamma$ interlaces
the original set, so (4.12.7) continues to hold. We see that the convergence
of $y_k$ to $y$ is monotone on $\Gamma$. If by chance some $y_j = y$, then $y_k = y$ for all $k \ge j$.
Otherwise, all the points $y_k$ lie on the same side of $y$, i.e., on the same connected
component of $\Gamma \setminus \{y\}$.


The main thing we need to establish to prove Theorem 4.12.2 is that the orbit 
through y is periodic. The next result takes us closer to that goal. 


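Lemma 4.12.4. Suppose $s > 0$ and $\Phi^s(y) \in \Gamma$. Then $\Phi^s(y) = y$.


Proof. We have

(4.12.10)  $\sup_{0 \le t \le s+1} \|\Phi^t(y_k) - \Phi^t(y)\| = \varepsilon_k \to 0, \quad \text{as } k \to \infty.$

It follows that there exist $\delta_k \to 0$ such that $\Phi^{s+\delta_k}(y_k) \in \Gamma$, and hence

(4.12.11)  $\Phi^{s+\delta_k}(y_k) = y_{k+\ell(k)}, \quad \text{for some } \ell(k) \in \{1,2,3,\dots\}.$

Thus

(4.12.12)  $\Phi^s(y) = \lim_{k\to\infty} \Phi^{s+\delta_k}(y_k) = \lim_{k\to\infty} y_{k+\ell(k)} = y,$

as asserted.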


We are ready for the endgame in the proof of Theorem 4.12.2. Let $s_j \nearrow +\infty$
and consider $z_j = \Phi^{s_j}(y)$. We have each $z_j \in K$, and passing to a subsequence, we
can assume

(4.12.13)  $z_j = \Phi^{s_j}(y) \to z \in K.$

We have $F(z) \ne 0$, so there is a curve segment $\tilde\Gamma$ through $z$, transverse to $F$.
Adjusting $s_j$, we can arrange

(4.12.14)  $z_j \in \tilde\Gamma.$

We need only two such points in such a curve $\tilde\Gamma$; say, upon relabeling,

(4.12.15)  $z_1 = \Phi^{s_1}(y), \quad z_2 = \Phi^{s_2}(y) = \Phi^{s_2-s_1}(z_1) \in \tilde\Gamma.$

See Figure 4.12.5.

Figure 4.12.5. March of $z_j = \Phi^{s_j}(y) \to z$

Note that

(4.12.16)  $\Phi^{t_k+s_1}(x) \to z_1,$

so we can use the previous results, with $t_k$ replaced by $t_k + s_1$ and $y$ by $z_1$, and $\Gamma$
by $\tilde\Gamma$. In this case, the analogue of the hypothesis in Lemma 4.12.4 applies:

(4.12.17)  $s_2 - s_1 > 0, \quad \Phi^{s_2-s_1}(z_1) \in \tilde\Gamma.$

The conclusion of Lemma 4.12.4 is

(4.12.18)  $\Phi^{s_2-s_1}(z_1) = z_1,$

i.e., actually $z_2 = z_1$. (The same argument gives $z_{j+1} = z_j$ for all $j$, so actually
$z = z_1$.)

Thus the orbit of $\Phi$ through $y$ is periodic, of period $s = s_2 - s_1$. Since $y \in L_\omega(x)$,
it follows that this periodic orbit is contained in $L_\omega(x)$.

Note that if $\Phi^s(y) = y$, then we can apply (4.12.11) to deduce that the times
$t_k$ arising in $\Phi^{t_k}(x) = y_k$ satisfy

(4.12.19)  $\limsup_{k\to\infty}\, (t_{k+1} - t_k) \le s.$

The last point to cover in the proof of Theorem 4.12.2 is that the periodic orbit
through $y$ contains all of $L_\omega(x)$. Indeed, let $\tilde y$ be another point in $L_\omega(x)$. We have

(4.12.20)  $\Phi^{\tau_j}(x) \to \tilde y, \quad \tau_j \nearrow +\infty.$

Using (4.12.19), we can write $\tau_j = t_{k_j} + \sigma_j$, $0 \le \sigma_j \le s + 1$, and passing to a
subsequence (along which $\sigma_j \to \sigma$) obtain

(4.12.21)  $\Phi^{\tau_j}(x) = \Phi^{\sigma_j}(y_{k_j}) \to \Phi^\sigma(y),$

hence $\tilde y = \Phi^\sigma(y)$. This completes the proof of Theorem 4.12.2.


The following equation, known as the van der Pol equation, illustrates the
workings of Theorem 4.12.2. The equation is

(4.12.22)  $x'' - \mu(1 - x^2)x' + x = 0.$

Here $\mu$ is a positive parameter. This models the current in a nonlinear circuit that
amplifies a weak current ($|x| < 1$) and damps a strong current ($|x| > 1$). See the
exercises for more on this. The equation (4.12.22) converts to the first order system

(4.12.23)  $x' = y, \quad y' = -x + \mu(1 - x^2)y.$

Figure 4.12.6 is a phase portrait for the case $\mu = 1$. The vector field $F$ associated
with (4.12.23) has one critical point, at the origin. The linearization of (4.12.23)
at the origin is

(4.12.24)  $\frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & \mu \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix},$

and the eigenvalues of this matrix are

(4.12.25)  $\frac{\mu}{2} \pm \frac{1}{2}\sqrt{\mu^2 - 4}.$

Thus the origin is a source whenever $\mu > 0$. It is a spiral source provided also
$\mu < 2$. Note that when $(x(t), y(t))$ solves (4.12.23),

(4.12.26)  $\frac{d}{dt}(x^2 + y^2) = 2\mu(1 - x^2)y^2,$

which is $\ge 0$ for $|x| \le 1$, and in particular is $\ge 0$ near the origin.


An examination of Figure 4.12.6 indicates the presence of a periodic orbit,
attracting all the other orbits. Let us see how this fits into the set-up of Theorem
4.12.2. To do this, we need to describe a closed bounded set $K \subset \mathbb{R}^2$ such that
$\Phi^t(K) \subset K$ for all $t \ge 0$, where $\Phi^t$ is the flow generated by $F$, and such that $F$ has
no critical points in $K$. We construct $K$ as follows. Look at the orbit of $F$ starting
at the point $A$ on the positive $y$-axis, shown in Figure 4.12.6 and again in Figure
4.12.7.

Figure 4.12.6. Van der Pol limit cycle

Figure 4.12.7. Van der Pol orbit

A numerical integration of (4.12.23) (using the Runge-Kutta scheme) shows that

(4.12.27)  $\Phi^t(A)$ winds clockwise about the origin, and again hits the positive $y$-axis at the point $B$, lying below $A$.

To this path from $A$ to $B$, one adds the line segment (on the $y$-axis) from $B$ to $A$,
producing a simple closed curve $C$. It follows readily from (4.12.23) that on this line
segment the vector field $F$ points to the right. Thus the closed region $\tilde K$ bounded
by this curve has the invariance property

(4.12.28)  $\Phi^t(\tilde K) \subset \tilde K, \quad \forall\, t \ge 0.$

We then pick $\varepsilon > 0$ small enough (in particular $< 1$), and set

(4.12.29)  $K = \tilde K \setminus \{(x,y) : x^2 + y^2 < \varepsilon^2\}.$

The fact that

(4.12.30)  $\Phi^t(K) \subset K, \quad \forall\, t \ge 0$

follows from (4.12.28) and (4.12.26). We have removed the only critical point of $F$,
so $K$ contains no critical points, and Theorem 4.12.2 applies.


It must be said that the validity of the argument just given relies on the accuracy
of the statement (4.12.27) about the orbit through $A$. Here we have relied on
a numerical approximation to that orbit. We applied the Runge-Kutta scheme,
described in §4.11, with step sizes $h = 10^{-2}$, $10^{-3}$, and $10^{-4}$, using double precision
(16 digit) variables, and got consistent results in all three cases. The last case
involves quite a small step size, and if one were to use 8 digit arithmetic, there
could be a danger of accumulating round-off errors. In any case, with today's
computers there is no point in using 8 digit arithmetic.
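
The consistency check just described can be sketched in Python as follows. This is a minimal illustration, not the program used for the figures: the function names are hypothetical, $\mu = 1$ is taken as in Figure 4.12.6, and the starting point $A = (0, 3)$ is an assumed choice. The routine follows the orbit with the classical Runge-Kutta scheme until $x$ changes sign from negative to nonnegative with $y > 0$, then locates the crossing of the $y$-axis by bisection on the length of the last step.

```python
def vdp(x, y, mu):
    """Right side of (4.12.23)."""
    return y, -x + mu * (1.0 - x * x) * y

def rk4_step(x, y, h, mu):
    """One classical Runge-Kutta step of size h for (4.12.23)."""
    k1x, k1y = vdp(x, y, mu)
    k2x, k2y = vdp(x + 0.5 * h * k1x, y + 0.5 * h * k1y, mu)
    k3x, k3y = vdp(x + 0.5 * h * k2x, y + 0.5 * h * k2y, mu)
    k4x, k4y = vdp(x + h * k3x, y + h * k3y, mu)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def first_return(y_start, h, mu=1.0, max_steps=10**6):
    """Start at A = (0, y_start), y_start > 0, and return the y-coordinate of
    the next crossing B of the positive y-axis."""
    # The first step leaves the axis; x' = y > 0 there, so x becomes positive.
    x, y = rk4_step(0.0, y_start, h, mu)
    for _ in range(max_steps):
        xn, yn = rk4_step(x, y, h, mu)
        if x < 0.0 <= xn and yn > 0.0:
            # Bisect on the partial step length at which x reaches 0.
            lo, hi = 0.0, h
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                xm, _ym = rk4_step(x, y, mid, mu)
                if xm < 0.0:
                    lo = mid
                else:
                    hi = mid
            return rk4_step(x, y, hi, mu)[1]
        x, y = xn, yn
    raise RuntimeError("no return to the positive y-axis found")
```

Calling `first_return(3.0, h)` for several step sizes `h` and comparing the results mimics the comparison described above; agreement across step sizes is evidence for, not a proof of, the statement (4.12.27).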


Theorem 4.12.2 is a special case of the following result. 


Bendixson's theorem. Let $F$ be a $C^1$ vector field on $\mathcal{O} \subset \mathbb{R}^2$, generating a flow
$\Phi^t$. Assume there is a set $K \subset \mathcal{O}$ that is a closed, bounded subset of $\mathbb{R}^2$ and satisfies
$\Phi^t(K) \subset K$ for all $t \ge 0$. Assume $F$ has at most finitely many critical points in $K$.
Then if $x \in K$, $L_\omega(x)$ is one of the following:

(a) a critical point,

(b) a periodic orbit,

(c) a cyclic graph consisting of critical points joined by orbits.


A proof can be found in [10], Chapter 16, or in [29], Chapter 10. Note that
alternative (c) is illustrated in Figure 4.12.2. We emphasize that both this result
and Theorem 4.12.2 are results for planar vector fields. In higher dimension, matters
are completely different, as we will discuss in §4.15.

We recall a device already used to deal with alternative (a), and develop it a
little further. Suppose $F$ is a $C^1$ vector field on $\mathcal{O} \subset \mathbb{R}^n$, and there is a function
$V \in C^1(\mathcal{O})$. Assume $V$ has a unique minimum, at $p \in K$. If $x(t) = \Phi^t(x_0)$, then,
by the chain rule,

(4.12.31)  $\frac{d}{dt} V(x(t)) = \nabla V(x(t)) \cdot F(x(t)).$

If also $V$ has the property

(4.12.32)  $\nabla V(y) \cdot F(y) < 0, \quad \forall\, y \in \mathcal{O} \setminus p,$

we say $V$ is a strong Lyapunov function for $F$. In such a case,

(4.12.33)  $\frac{d}{dt} V(x(t)) < 0, \quad \text{whenever } x(t) \ne p.$

If we replace (4.12.32) by the weaker property

(4.12.34)  $\nabla V(y) \cdot F(y) \le 0, \quad \forall\, y \in \mathcal{O},$

we say $V$ is a Lyapunov function for $F$. In such a case,

(4.12.35)  $\frac{d}{dt} V(x(t)) \le 0, \quad \forall\, t \ge 0.$

Thus, as $t \nearrow +\infty$, $V(x(t))$ monotonically approaches a limit, $V_0$, which must be
$\ge V(p)$, and furthermore,

(4.12.36)  $\lim_{t \to +\infty} \frac{d}{dt} V(x(t)) = 0.$

This has the following immediate consequence.


Proposition 4.12.5. Let $F$ be a $C^1$ vector field on $\mathcal{O} \subset \mathbb{R}^n$, generating a flow $\Phi^t$.
Assume there is a set $K \subset \mathcal{O}$ that is a closed, bounded subset of $\mathbb{R}^n$ and satisfies
$\Phi^t(K) \subset K$ for all $t \ge 0$. Take $x_0 \in K$. Assume $V \in C^1(\mathcal{O})$ is a Lyapunov
function for $F$. Then

(4.12.37)  $L_\omega(x_0) \subset \{y \in \mathcal{O} : \nabla V(y) \cdot F(y) = 0\}.$

If $V$ is a strong Lyapunov function, then

(4.12.38)  $L_\omega(x_0) = \{p\}.$


Exercises 


1. Let $\mathcal{O} \subset \mathbb{R}^n$ be open and assume $\Omega \subset \mathcal{O}$ is a closed bounded set with smooth
boundary $\partial\Omega$, with outward pointing normal $n$. Let $F$ be a $C^1$ vector field on $\mathcal{O}$,
generating the flow $\Phi^t$. Assume

(4.12.39)  $F \cdot n < 0 \quad \text{on } \partial\Omega.$

Show that

(4.12.40)  $\Phi^t(\Omega) \subset \Omega, \quad \forall\, t \ge 0.$

Compare Exercises 4-5 of §4.3.


2. In the setting of Exercise 1, show that

(4.12.41)  $\Phi^t(\Omega) \subset \Phi^s(\Omega) \quad \text{for } 0 < s < t.$

Set

(4.12.42)  $B = \bigcap_{t \in \mathbb{R}^+} \Phi^t(\Omega) = \bigcap_{k \in \mathbb{Z}^+} \Phi^k(\Omega).$

Show that

(4.12.43)  $\Phi^t(B) = B, \quad \forall\, t \ge 0.$

REMARK. It can be shown from material in Appendix 4.B that $B$ is nonempty,
closed, and bounded.


3. In the setting of Exercise 2, show that

(4.12.44)  $\forall\, x \in \Omega, \quad L_\omega(x) \subset B.$


4. In the setting of Exercise 2, show that

(4.12.45)  $\operatorname{div} F < 0 \ \text{on } \Omega \Longrightarrow \operatorname{Vol}(B) = 0.$


5. In the setting of Exercise 4, assume that $n = 2$, that $\Omega \subset \mathbb{R}^2$ is an annulus, and
that $F$ has no critical points in $\Omega$, so by Theorem 4.12.2 there is a periodic orbit
of $\Phi$ in $\Omega$. Show that, due to (4.12.45), there can be only one periodic orbit of $\Phi$
in $\Omega$.

Hint. Feel free to use the Jordan curve theorem.


Figure 4.12.8. RLC circuit with nonlinear resistance


Exercises 6-8 deal with a nonlinear RLC circuit, as pictured in Figure 4.12.8. The
setup is as in §1.13 (see also Chapter 3, §3.5), except that Ohm's law is modified.
The voltage drop across the resistor is given by

(4.12.46)  $V = f(I),$

where $f$ can be nonlinear, and not necessarily monotonic. As an example, one could
have

(4.12.47)  $f(I) = \mu\left(\frac{I^3}{3} - I\right).$

Vacuum tubes and transistors can behave as such circuit elements. The voltage
drop across the capacitor and the inductor are, as before, given respectively by

(4.12.48)  $V = L\frac{dI}{dt}, \quad V = \frac{Q}{C}.$

Units of current, etc., are as in §1.13.


6. Modify the computations done in (1.14.1)-(1.14.7) of Chapter 1 and show that
the current $I(t)$ satisfies the differential equation

(4.12.49)  $\frac{d^2I}{dt^2} + \frac{f'(I)}{L}\frac{dI}{dt} + \frac{1}{LC} I = \frac{E'(t)}{L}.$

Show that rescaling $I$ and $t$ leads to (4.12.23), when $f(I)$ is given by (4.12.47) and
$E = 0$. More generally, rescale (4.12.49) to

(4.12.50)  $\frac{d^2x}{dt^2} + f'(x)\frac{dx}{dt} + x = g(t).$

7. Assume $g \equiv 0$ in (4.12.50). Parallel to (4.12.23), one can convert this equation
to the first order system

    $x' = y, \quad y' = -x - f'(x)y.$

Show that you can also convert it to the first order system

(4.12.51)  $x' = y - f(x), \quad y' = -x.$

This is called a Liénard equation.


8. Show that if $(x(t), y(t))$ solves (4.12.51), then

(4.12.52)  $\frac{d}{dt}(x^2 + y^2) = -2x f(x).$


4.13. Predator-prey equations 


Here and in the following section, we consider differential equations that model
population densities. We start with one species. The simplest model is the exponential
growth model:

(4.13.1)  $\frac{dx}{dt} = ax.$

Here $x(t)$ denotes the population of the species (or rather, an approximation to
what would be an integer valued function). The model simply states that the rate
of growth of the population is proportional to the population itself. The solution
to (4.13.1) is our old friend $x(t) = e^{at}x(0)$. This unbounded increase in population
is predicated on the existence of limitless resources to nourish the species. An
alternative to (4.13.1) posits that the resources can support a population no greater
than $K$. The following is called the logistic equation:

(4.13.2)  $\frac{dx}{dt} = ax(1 - bx),$

where $b = 1/K$. In this model, (4.13.1) is a good approximation for small $x$, but
the rate of growth slows down to 0 as $x$ approaches its upper limit $K$. The equation
(4.13.2) can be solved by separation of variables:

(4.13.3)  $\frac{dx}{x(1 - bx)} = a\,dt.$

The reader can perform the integration as an exercise.
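
A reader who has carried out the integration can check the answer numerically. The sketch below is an illustration only: it compares a Runge-Kutta solution of (4.13.2) with the closed form $x(t) = x_0 e^{at}/(1 + bx_0(e^{at} - 1))$, which is the standard logistic solution for $0 < x_0 < 1/b$ and is stated here rather than derived. The function names and the sample parameter values are hypothetical.

```python
import math

def logistic_exact(t, x0, a, b):
    """Closed-form solution of dx/dt = a x (1 - b x) with x(0) = x0."""
    e = math.exp(a * t)
    return x0 * e / (1.0 + b * x0 * (e - 1.0))

def logistic_rk4(t_end, x0, a, b, n=1000):
    """Integrate (4.13.2) from 0 to t_end with n classical Runge-Kutta steps."""
    def rhs(x):
        return a * x * (1.0 - b * x)
    h = t_end / n
    x = x0
    for _ in range(n):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

# Sample values (assumed): a = 1, b = 1/2, so the upper limit is K = 2.
numeric = logistic_rk4(5.0, 0.1, 1.0, 0.5)
exact = logistic_exact(5.0, 0.1, 1.0, 0.5)
```

For large $t$ the closed form tends to $1/b = K$, in line with the description of the model above.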

The function $F(x) = ax(1 - bx)$ on the right side of (4.13.2) is a one-dimensional
vector field, with critical points at $x = 0$ and $x = 1/b$. The intervals $(-\infty, 0)$, $(0, 1/b)$,
and $(1/b, \infty)$ are all invariant under the flow generated by $F$, although only the
interval $(0, 1/b)$ has biological relevance. See Figure 4.13.1 for the phase portrait.

Figure 4.13.1. Phase portrait for logistic equation


We turn to a class of $2 \times 2$ systems called predator-prey equations. For this, we
set

(4.13.4)
    $x(t)$ = population of predators,
    $y(t)$ = population of prey,
    $\zeta(y)$ = rate at which each predator consumes prey.

Depending on the choice of the exponential growth model or the logistic model for
the species of prey in the absence of predators, the following systems arise to model
these populations:

(4.13.5)  $\frac{dx}{dt} = -ax + b\zeta x, \quad \frac{dy}{dt} = ry - \zeta x,$

or

(4.13.6)  $\frac{dx}{dt} = -ax + b\zeta x, \quad \frac{dy}{dt} = ry(1 - cy) - \zeta x.$

Here, $a, b, c$, and $r$ are positive constants. As for the rate of feeding $\zeta$, we assume

(4.13.7)  $\zeta = \zeta(y).$

Clearly, if $y = 0$, then $\zeta = 0$. One possibility that is used is

(4.13.8)  $\zeta(y) = \kappa y,$

for some positive constant $\kappa$. This posits that the rate of feeding of a predator
is proportional to the rate of close encounters of that predator with members of
the other species, which in turn is proportional to the population $y$. This seems
intuitively reasonable if $y$ is not large, but most creatures stop eating once they are
full, so a more reasonable candidate for $\zeta(y)$ might be as pictured in Figure 4.13.2,
representing a feeding rate bounded by $\beta$.

Figure 4.13.2. Rate of feeding curve for predator-prey system

A class of functions of this sort is given by

(4.13.9)  $\zeta(y) = \frac{\kappa y}{1 + \gamma y}, \quad \frac{\kappa}{\gamma} = \beta.$

Another class is

(4.13.10)  $\zeta(y) = \beta(1 - e^{-\gamma y}), \quad \beta\gamma = \kappa.$

Let us examine various cases in more detail.


Volterra-Lotka equations 


The case (4.13.5) with $\zeta$ given by (4.13.8) produces systems called Volterra-Lotka
equations:

(4.13.11)  $\frac{dx}{dt} = -ax + \sigma xy, \quad \sigma = b\kappa, \qquad \frac{dy}{dt} = ry - \kappa xy.$

Note that the $x$-axis and $y$-axis are invariant under the flow defined by this system.
We have $x' = -ax$ on the $x$-axis and $y' = ry$ on the $y$-axis. It follows that the
first quadrant, where $x \ge 0$ and $y \ge 0$, is invariant under the flow. This is the
region in the $(x,y)$-plane of biological significance. The vector field $V(x,y) =
(-ax + \sigma xy, ry - \kappa xy)^t$ has two critical points. One is the origin. Note that

(4.13.12)  $DV(0,0) = \begin{pmatrix} -a & 0 \\ 0 & r \end{pmatrix},$

so the origin is a saddle. The other critical point is

(4.13.13)  $(x_0, y_0) = \left(\frac{r}{\kappa}, \frac{a}{\sigma}\right).$

Note that

(4.13.14)  $DV(x_0, y_0) = \begin{pmatrix} 0 & \sigma r/\kappa \\ -\kappa a/\sigma & 0 \end{pmatrix},$

with purely imaginary eigenvalues, so we have a center for the linearization of $V$ at
$(x_0, y_0)$. In fact, $(x_0, y_0)$ is a center for $V$, as we now show.

From (4.13.11) we get

(4.13.15)  $\frac{dy}{dx} = \frac{y(r - \kappa x)}{x(\sigma y - a)},$

which separates to

(4.13.16)  $\left(\sigma - \frac{a}{y}\right) dy = \left(\frac{r}{x} - \kappa\right) dx.$

Integrating yields

(4.13.17)  $\sigma y - a\log y = r\log x - \kappa x + C.$

We deduce that the following smooth function on the region $x, y > 0$,

(4.13.18)  $H(x,y) = \sigma y - a\log y + \kappa x - r\log x,$

is constant on orbits of (4.13.11), i.e., these orbits lie on level curves of $H$. Note
that

(4.13.19)  $\nabla H(x,y) = \begin{pmatrix} \kappa - r/x \\ \sigma - a/y \end{pmatrix}, \quad D^2H(x,y) = \begin{pmatrix} r/x^2 & 0 \\ 0 & a/y^2 \end{pmatrix};$

hence, with $(x_0, y_0)$ as in (4.13.13),

(4.13.20)  $\nabla H(x_0, y_0) = 0, \quad D^2H(x_0, y_0) = \begin{pmatrix} \kappa^2/r & 0 \\ 0 & \sigma^2/a \end{pmatrix},$

the latter matrix being positive definite, so $H$ has a minimum at $(x_0, y_0)$, which
implies that $(x_0, y_0)$ is a center for $V$. The phase portrait for orbits of (4.13.11) is
pictured in Figure 4.13.3.

Figure 4.13.3. Phase portrait for Volterra-Lotka system
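
The statement that $H$ is constant on orbits of (4.13.11) lends itself to a numerical check. The following Python sketch integrates the system with the classical Runge-Kutta scheme and records the largest deviation of $H$ from its initial value; the function names and the sample parameters are hypothetical, chosen only for illustration.

```python
import math

def vl_field(x, y, a, r, sigma, kappa):
    """Right side of the Volterra-Lotka system (4.13.11)."""
    return -a * x + sigma * x * y, r * y - kappa * x * y

def vl_H(x, y, a, r, sigma, kappa):
    """The function H of (4.13.18), defined for x, y > 0."""
    return sigma * y - a * math.log(y) + kappa * x - r * math.log(x)

def vl_rk4_step(x, y, h, params):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = vl_field(x, y, *params)
    k2x, k2y = vl_field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, *params)
    k3x, k3y = vl_field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, *params)
    k4x, k4y = vl_field(x + h * k3x, y + h * k3y, *params)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def max_H_drift(x, y, h, n, params):
    """Largest |H - H(initial point)| seen over n steps of size h."""
    h0 = vl_H(x, y, *params)
    drift = 0.0
    for _ in range(n):
        x, y = vl_rk4_step(x, y, h, params)
        drift = max(drift, abs(vl_H(x, y, *params) - h0))
    return drift

# Sample parameters (a, r, sigma, kappa), assumed for illustration.
params = (1.0, 2.0, 1.0, 1.0)
drift = max_H_drift(1.0, 1.0, 1e-3, 10000, params)
```

With these values the critical point (4.13.13) is $(2, 1)$, and the drift in $H$ should be at the level of the discretization error, consistent with orbits lying on level curves of $H$.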


The system (4.13.11) was studied independently by Lotka and Volterra around 
1925, by Lotka as a model of some chemical reactions and by Volterra as a predator- 
prey model, specifically for sharks preying on another species of fish. Volterra made 
the following further observation. Bring in another type of predator, fishermen. 
Assume the fishermen keep everything they catch and that the probability of getting
caught in their nets is the same for sharks and their prey. Then the system (4.13.11)
gets revised to

(4.13.21)  $\frac{dx}{dt} = -ax + \sigma xy - ex, \quad \frac{dy}{dt} = ry - \kappa xy - ey.$

Now (4.13.21) has the same form as (4.13.11), with $a$ replaced by $a + e$ and with $r$
replaced by $r - e$, all these constants remaining positive as long as

(4.13.22)  $0 < e < r.$

Then the previous analysis applies. The system (4.13.21) has a stable critical point
at

(4.13.23)  $(x_1, y_1) = \left(\frac{r - e}{\kappa}, \frac{a + e}{\sigma}\right).$


Note that at this critical point there are fewer sharks and more prey, compared to 
(4.13.13). Of course, this depends on the hypothesis (4.13.22). If e > r, things are 
catastrophically different. 


First modification 


We turn from Volterra-Lotka equations to predator-prey models given by (4.13.6),
still keeping (4.13.8). Then we have the following system:

(4.13.24)  $\frac{dx}{dt} = -ax + \sigma xy, \quad \sigma = b\kappa, \qquad \frac{dy}{dt} = ry(1 - cy) - \kappa xy.$

As with (4.13.11), the $x$-axis and $y$-axis are invariant under the flow defined by this
system. We have $x' = -ax$ on the $x$-axis and $y' = ry(1 - cy)$ on the $y$-axis. Again,
the first quadrant ($x \ge 0$, $y \ge 0$) is invariant under the flow. Note furthermore
that, for

(4.13.25)  $V(x,y) = (-ax + \sigma xy, \; ry(1 - cy) - \kappa xy)^t,$

we have

(4.13.26)  $V\left(x, \frac{1}{c}\right) = \left(\left(\frac{\sigma}{c} - a\right)x, \; -\frac{\kappa}{c}x\right)^t,$

which points downward for $x > 0$. It follows that

(4.13.27)  $R = \left\{(x,y) : x \ge 0, \; 0 \le y \le \frac{1}{c}\right\}$

is invariant under this flow. It is this region in the $(x,y)$-plane that is of biological
significance.

To proceed, we find the critical points of $V(x,y)$, given by (4.13.25). Two of
these are

(4.13.28)  $(0,0)$ and $\left(0, \frac{1}{c}\right).$

$DV(0,0)$ is again given by (4.13.12), so $(0,0)$ is a saddle. Also,

(4.13.29)  $DV\left(0, \frac{1}{c}\right) = \begin{pmatrix} -a + \sigma/c & 0 \\ -\kappa/c & -r \end{pmatrix}.$

$V$ has a third critical point, at

(4.13.30)  $y_0 = \frac{a}{\sigma}, \quad x_0 = \frac{r}{\kappa}\left(1 - \frac{ca}{\sigma}\right) = \frac{rc}{\kappa\sigma}\left(\frac{\sigma}{c} - a\right).$

Note how this point is shifted to the left from the point (4.13.13). There are three
cases to consider.


Case I. $\sigma/c - a < 0$.

In this case, the critical point (4.13.30) is not in the first quadrant, so $V$ has only the
critical points (4.13.28) in $R$. In this case (4.13.29) has two negative eigenvalues,
so the critical point $(0, 1/c)$ is a sink. Note that the $x$-component of $V(x,y)$ is

(4.13.31)  $x(\sigma y - a) \le x\left(\frac{\sigma}{c} - a\right) < 0, \quad \text{for } x > 0, \; y \le \frac{1}{c},$

so $V$ points to the left everywhere in $R$ except the left edge. Consequently, the
population of predators is driven to extinction as $t \to +\infty$, whatever the initial
condition.


Case II. $\sigma/c - a > 0$.

In this case the third critical point $(x_0, y_0)$ is in the first quadrant. In fact, $y_0 =
a/\sigma < 1/c$, so $(x_0, y_0) \in R$. Now (4.13.29) has one positive and one negative
eigenvalue, so the critical point $(0, 1/c)$ is a saddle. As for the nature of $(x_0, y_0)$,
we have

(4.13.32)  $DV(x_0, y_0) = \begin{pmatrix} -a + \sigma y_0 & \sigma x_0 \\ -\kappa y_0 & r(1 - 2cy_0) - \kappa x_0 \end{pmatrix} = \begin{pmatrix} 0 & \sigma x_0 \\ -\kappa y_0 & -rcy_0 \end{pmatrix}.$

Note that

(4.13.33)  $\det DV(x_0, y_0) = \frac{arc}{\sigma}\left(\frac{\sigma}{c} - a\right) > 0, \quad \operatorname{Tr} DV(x_0, y_0) = -\frac{rca}{\sigma} < 0.$

It follows that the eigenvalues of $DV(x_0, y_0)$ are either both negative or have
negative real part. Hence $(x_0, y_0)$ is a sink.

We claim that the orbit through each point in $R$ not on the $x$ or $y$-axis
approaches $(x_0, y_0)$ as $t \to +\infty$. To see this, we construct a Lyapunov function. We do
this by modifying $H(x,y)$ in (4.13.18), which has a minimum at the point (4.13.13),
to one that has a minimum at the point (4.13.30). We take

(4.13.34)  $\tilde H(x,y) = \sigma y - a\log y + \kappa x - r\left(1 - \frac{ca}{\sigma}\right)\log x.$

If $(x(t), y(t))$ solves (4.13.24), a computation gives

(4.13.35)  $\frac{d}{dt}\tilde H(x,y) = -\frac{rc}{\sigma}(\sigma y - a)^2.$

By Proposition 4.12.5, if we take any point $p \in R$, with positive $x$ and $y$-coordinates
(so it is in the domain of $\tilde H$), the $\omega$-limit set of $p$ satisfies

(4.13.36)  $L_\omega(p) \subset \left\{(x,y) \in R : y = \frac{a}{\sigma}\right\}.$

The right side is a horizontal line to which $V$ is clearly transverse except at the
critical point $(x_0, y_0)$, so indeed $L_\omega(p) = (x_0, y_0)$.

Figure 4.13.4. First modification of Volterra-Lotka system (Case II)

See Figure 4.13.4 for a phase portrait treating Case II.
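
The two claims of Case II, that $\tilde H$ does not increase along orbits and that orbits in the interior of $R$ approach $(x_0, y_0)$, can be illustrated numerically. The Python sketch below is a rough check under assumed sample parameters ($a = r = \kappa = c = 1$, $\sigma = 2$, so $\sigma/c - a > 0$ and $(x_0, y_0) = (1/2, 1/2)$); the function names are hypothetical.

```python
import math

def mod_field(x, y, a, r, sigma, kappa, c):
    """Right side of (4.13.24)."""
    return -a * x + sigma * x * y, r * y * (1.0 - c * y) - kappa * x * y

def H_tilde(x, y, a, r, sigma, kappa, c):
    """The Lyapunov function (4.13.34), defined for x, y > 0."""
    return (sigma * y - a * math.log(y) + kappa * x
            - r * (1.0 - c * a / sigma) * math.log(x))

def mod_rk4_step(x, y, h, p):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = mod_field(x, y, *p)
    k2x, k2y = mod_field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, *p)
    k3x, k3y = mod_field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, *p)
    k4x, k4y = mod_field(x + h * k3x, y + h * k3y, *p)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def follow_orbit(x, y, h, n, p):
    """Return the final point and the largest one-step increase of H_tilde."""
    worst = 0.0
    prev = H_tilde(x, y, *p)
    for _ in range(n):
        x, y = mod_rk4_step(x, y, h, p)
        cur = H_tilde(x, y, *p)
        worst = max(worst, cur - prev)
        prev = cur
    return x, y, worst

# Sample parameters (a, r, sigma, kappa, c); start inside R at (1, 0.2).
p = (1.0, 1.0, 2.0, 1.0, 1.0)
x_end, y_end, worst_increase = follow_orbit(1.0, 0.2, 0.01, 20000, p)
```

Here `worst_increase` should be zero up to rounding and discretization error, and the final point should be close to $(x_0, y_0)$.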


Case III. $\sigma/c - a = 0$.

In this case $(x_0, y_0) = (0, 1/c)$. In (4.13.29) the eigenvalues are 0 and $-r$, so $(0, 1/c)$
is a degenerate critical point. In place of (4.13.31) we have that the $x$-component
of $V(x,y)$ is

(4.13.37)  $x(\sigma y - a) \le 0, \quad \text{for } x \ge 0, \; y \le \frac{1}{c},$

and it is strictly negative for $x > 0$, $y < 1/c$. Hence, as in Case I, the population
of predators is driven to extinction as $t \to +\infty$.
Second modification 


We now move to the next level of sophistication, using the system (4.13.6) with
$\zeta = \zeta(y)$, described as in Figure 4.13.2. Thus, we look at systems of the form

(4.13.38)  $\frac{dx}{dt} = -ax + bx\zeta(y), \quad \frac{dy}{dt} = ry(1 - cy) - x\zeta(y).$

As before, $a, b, c$, and $r$ are all positive constants. To be precise about what we mean
when we say $\zeta(y)$ behaves as in Figure 4.13.2, we make the following hypotheses:

(4.13.39)
    (a) $\zeta : [0,\infty) \to [0,\infty)$ is smooth,
    (b) $\zeta(0) = 0$,
    (c) $\zeta'(y) > 0, \quad \forall\, y \ge 0$,
    (d) $\sup \zeta(y) = \beta < \infty$,
    (e) $\zeta''(y) \le 0$.

All these conditions are satisfied by the examples (4.13.9) and (4.13.10). Hypothesis
(c) implies $\zeta$ is strictly monotone increasing, and hypothesis (e) implies $\zeta$ is concave.
In this case, the vector field is

(4.13.40)  $V(x,y) = (x(b\zeta(y) - a), \; ry(1 - cy) - x\zeta(y))^t.$

Parallel to (4.13.26),

(4.13.41)  $V\left(x, \frac{1}{c}\right) = \left(\left(b\zeta\left(\frac{1}{c}\right) - a\right)x, \; -\zeta\left(\frac{1}{c}\right)x\right)^t,$

which points downward for $x > 0$, and again it follows that the region $R$, given by
(4.13.27), is invariant under the flow $\Phi^t$ generated by $V$, for $t \ge 0$, and this is the
region in the $(x,y)$-plane that is of biological significance.

Next, we find the critical points of $V(x,y)$. Again, two of them are

    $(0,0)$ and $\left(0, \frac{1}{c}\right),$

and again $DV(0,0)$ is given by (4.13.12), so $(0,0)$ is a saddle. This time,

(4.13.42)  $DV\left(0, \frac{1}{c}\right) = \begin{pmatrix} b\zeta(1/c) - a & 0 \\ -\zeta(1/c) & -r \end{pmatrix}.$

There is a third critical point $(x_0, y_0)$ if these coordinates satisfy

(4.13.43)  $\zeta(y_0) = \frac{a}{b}, \quad x_0 = \frac{b}{a}\, r y_0 (1 - cy_0).$

Under the hypotheses (4.13.39), the first equation in (4.13.43) has a (unique)
solution if and only if

(4.13.44)  $\frac{a}{b} < \beta.$

From here on we will assume (4.13.44) holds, and leave it to the reader to consider
the behavior of the flow when (4.13.44) fails. Given (4.13.44), $x_0$ and $y_0$ are well
defined by (4.13.43). Parallel to the study of (4.13.30), again we have three cases.

    Case I.   $1 - cy_0 < 0$,
    Case II.  $1 - cy_0 > 0$,
    Case III. $1 - cy_0 = 0$.

In Case I, $(x_0, y_0)$ is not in the first quadrant, and in Case III, $(x_0, y_0) = (0, 1/c)$.
Again we leave these cases to the reader to think about. We concentrate on Case
II.
In Case II, $x_0 > 0$ and $0 < y_0 < 1/c$, so

(4.13.45)  $(x_0, y_0) \in R.$

Given $\zeta(y_0) = a/b$ and the hypotheses (4.13.39) on $\zeta$, we have

(4.13.46)  $\zeta\left(\frac{1}{c}\right) > \frac{a}{b} \iff \frac{1}{c} > y_0 \iff 1 - cy_0 > 0,$

and hence in Case II, $DV(0, 1/c)$ has one positive eigenvalue and one negative
eigenvalue, so

(4.13.47)  $\left(0, \frac{1}{c}\right)$ is a saddle.

(In Case I, the eigenvalues of $DV(0, 1/c)$ are both negative, so $(0, 1/c)$ is a sink,
and in Case III these eigenvalues are 0 and $-r$.) Next, a computation gives the
following analogue of (4.13.32):

(4.13.48)  $DV(x_0, y_0) = \begin{pmatrix} b\zeta(y_0) - a & b\zeta'(y_0)x_0 \\ -\zeta(y_0) & r(1 - 2cy_0) - x_0\zeta'(y_0) \end{pmatrix} = \begin{pmatrix} 0 & b\zeta'(y_0)x_0 \\ -a/b & r(1 - 2cy_0) - x_0\zeta'(y_0) \end{pmatrix},$

and parallel to (4.13.33) we have

(4.13.49)  $\det DV(x_0, y_0) = a x_0 \zeta'(y_0) > 0,$
           $\operatorname{Tr} DV(x_0, y_0) = r(1 - 2cy_0) - x_0\zeta'(y_0) = r\left[-cy_0 + (1 - cy_0)\left(1 - \frac{\zeta'(y_0)}{\zeta(y_0)}y_0\right)\right].$

Let us set

(4.13.50)  $Z_0 = 1 - \frac{\zeta'(y_0)}{\zeta(y_0)}\, y_0.$

Given $\zeta$, this is a function of $a/b$, but it is independent of $c$ and $r$. Note that, since
$\zeta(0) = 0$,

(4.13.51)  $\frac{\zeta(y_0)}{y_0} = \zeta'(\tilde y), \quad \text{for some } \tilde y \in (0, y_0),$

by the mean value theorem, so the hypotheses on $\zeta$ in (4.13.39) imply

(4.13.52)  $0 \le Z_0 < 1.$

(Note that in the context of the previous model, with $\zeta(y)$ given by (4.13.8), $Z_0 = 0$.)
We have

(4.13.53)  $\operatorname{Tr} DV(x_0, y_0) = r[Z_0(1 - cy_0) - cy_0].$

This gives rise to three cases.


Case IIA. $Z_0 < cy_0/(1 - cy_0)$.
Then $\operatorname{Tr} DV(x_0, y_0) < 0$, so, by (4.13.49),

(4.13.54)  $(x_0, y_0)$ is a sink.
Case IIB. $Z_0 > cy_0/(1 - cy_0)$.
Then $\operatorname{Tr} DV(x_0, y_0) > 0$, so, by (4.13.49),

(4.13.55)  $(x_0, y_0)$ is a source.


Case IIC. $Z_0 = cy_0/(1 - cy_0)$.

Then $\operatorname{Tr} DV(x_0, y_0) = 0$, so, by (4.13.49), the eigenvalues of $DV(x_0, y_0)$ are purely
imaginary numbers (nonzero). In this case, $(x_0, y_0)$ is a center for the linearization
of $V$.


We will concentrate on Cases IIA and IIB. Before pursuing these cases further,
we want to describe a family of bounded domains in $R$ that are invariant under the
flow $\Phi^t$ for $t \ge 0$. Namely, consider the triangle $\mathcal{T}_\mu$, with vertices at $(0, 1/c)$, $(0, 0)$,
and $(\mu, 0)$, as pictured in Figure 4.13.5.

Figure 4.13.5. Invariant domain


Claim. If $\mu > 0$ is large enough, the triangle $\mathcal{T}_\mu$ is invariant under $\Phi^t$, for $t \ge 0$.


Proof. Note that $V$ is vertical on the left edge of $\mathcal{T}_\mu$, with critical points at the
endpoints of this line segment. Also $V$ points horizontally to the left on the bottom
edge of $\mathcal{T}_\mu$. It remains to show that $V$ points into $\mathcal{T}_\mu$ along the line segment from
$(0, 1/c)$ to $(\mu, 0)$, provided $\mu$ is sufficiently large. This line segment is given by

(4.13.56)  $x = \mu(1 - cy), \quad 0 \le y \le \frac{1}{c},$

and the vector

(4.13.57)  $N_\mu = \begin{pmatrix} 1 \\ \mu c \end{pmatrix}$

is normal to this segment, and points away from $\mathcal{T}_\mu$. We want to show that $V \cdot N_\mu \le 0$
along this line segment, for $\mu$ large. Indeed, from (4.13.40),

(4.13.58)  $V(\mu(1 - cy), y) \cdot N_\mu = (1 - cy)[\mu(b\zeta(y) - a) + \mu c r y - \mu^2 c \zeta(y)]$
           $= \mu(1 - cy)[-a + cry - (\mu c - b)\zeta(y)],$

and under the hypotheses (4.13.39) on $\zeta$, this is

(4.13.59)  $\le 0, \quad \forall\, y \in \left[0, \frac{1}{c}\right],$

if $\mu$ is sufficiently large, say $\mu \ge \mu_0$.


A similar computation shows that, if $\mu_1 > \mu_0$, then, for each $p \in R$, $\Phi^t(p) \in \mathcal{T}_{\mu_1}$
for all sufficiently large $t$.


Back to Cases IIA and IIB, as we have seen, in Case IIA $(x_0, y_0)$ is a sink. It
is possible to show that

(4.13.60)  in Case IIA, $\Phi^t(p) \to (x_0, y_0)$, as $t \to +\infty$,

for all $p$ in the interior of $R$, so the phase portrait has qualitative features similar
to Figure 4.13.4. On the other hand, in Case IIB, $(x_0, y_0)$ is a source. Hence there
is an open set $U$ containing $(x_0, y_0)$ such that

(4.13.61)  $\mathcal{T}_{\mu_0} \setminus U$ is invariant under $\Phi^t$, for $t \ge 0$.

This region does contain the two critical points $(0, 0)$ and $(0, 1/c)$, on its boundary,
but since they are saddles, the argument used to establish the Poincaré-Bendixson
theorem, Theorem 4.12.2, shows that

(4.13.62)  in Case IIB, $L_\omega(p)$ is a periodic orbit,

for all $p \ne (x_0, y_0)$ in the interior of $R$. The phase portrait is depicted in Figure
4.13.6.
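
The case distinction IIA/IIB/IIC is mechanical once $\zeta$ is fixed, and can be sketched in Python. The sketch below assumes the feeding rate (4.13.9), $\zeta(y) = \kappa y/(1 + \gamma y)$, for which solving $\zeta(y_0) = a/b$ gives $y_0 = a/(b\kappa - a\gamma)$; the function names are hypothetical, and the sample values are the ones quoted for Figure 4.13.6 in the exercises ((4.13.69) together with $c = 1/4$, $r = 1$).

```python
def zeta(y, kappa, gamma):
    """Feeding rate (4.13.9)."""
    return kappa * y / (1.0 + gamma * y)

def dzeta(y, kappa, gamma):
    """Derivative of (4.13.9) with respect to y."""
    return kappa / (1.0 + gamma * y) ** 2

def classify_case_ii(a, b, c, r, kappa, gamma, tol=1e-12):
    """Locate (x0, y0) via (4.13.43) and classify it by the sign of the trace (4.13.53)."""
    if not a / b < kappa / gamma:
        raise ValueError("(4.13.44) fails: no critical point (x0, y0)")
    y0 = a / (b * kappa - a * gamma)           # solves zeta(y0) = a/b
    if 1.0 - c * y0 <= 0.0:
        raise ValueError("not in Case II: 1 - c*y0 <= 0")
    x0 = (b / a) * r * y0 * (1.0 - c * y0)     # second part of (4.13.43)
    z0 = 1.0 - dzeta(y0, kappa, gamma) * y0 / zeta(y0, kappa, gamma)   # (4.13.50)
    trace = r * (z0 * (1.0 - c * y0) - c * y0)                         # (4.13.53)
    det = a * x0 * dzeta(y0, kappa, gamma)                             # (4.13.49)
    if trace < -tol:
        case = "IIA"     # sink
    elif trace > tol:
        case = "IIB"     # source
    else:
        case = "IIC"     # center for the linearization
    return {"x0": x0, "y0": y0, "Z0": z0, "trace": trace, "det": det, "case": case}

# Parameters quoted for Figure 4.13.6: a=1, b=2, kappa=1, gamma=1, c=1/4, r=1.
result = classify_case_ii(1.0, 2.0, 0.25, 1.0, 1.0, 1.0)
```

For these values one gets $(x_0, y_0) = (3/2, 1)$, $Z_0 = 1/2$, and a positive trace, which places the system in Case IIB.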


Exercises 


Exercises 1-5 deal with the system (4.13.37), i-e., 

x’ = —ax + bxC(y), 

y' =ry(1— cy) — #¢(y), 
where ¢(y) is given by (4.13.9), ie., 


(4.13.63) 


(4.13.64) fi se a Bee 
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1/ce 


Yo 7 


pa) 
Figure 4.13.6. Second modification of the Volterra-Lotka system (Case IIB) 
As usual, a,b,c, &,y,7 € (0,00). The exercises deal with when Cases I-III, specified 


below (4.13.44), hold. Recall these cases apply if and only if there is a critical point 
(Zo, yo) given by (4.13.43), ie., if and only if 


K 


a 
(4.13.65) poea 


We will assume this holds. 
1. Show that the critical point $(x_0, y_0)$ is given by

(4.13.66)  $y_0 = \frac{a}{b\kappa - a\gamma}, \quad x_0 = \frac{b}{a}\, r y_0 (1 - c y_0).$


2. Show that
   Case I $\iff ac > b\kappa - a\gamma$,
   Case II $\iff ac < b\kappa - a\gamma$,
   Case III $\iff ac = b\kappa - a\gamma$.


3. Let $Z$ be given by (4.13.50), i.e.,

(4.13.67)  $Z = 1 - \frac{\zeta'(y_0)}{\zeta(y_0)}\, y_0.$

Show that

(4.13.68)  $Z = \frac{a\gamma}{b\kappa}.$

4. In Case II, recall Cases IIA-IIC, specified below (4.13.53). Show that

   Case IIA $\iff \frac{\gamma}{\kappa} < \frac{bc}{b\kappa - a\gamma - ac}$,
   Case IIB $\iff \frac{\gamma}{\kappa} > \frac{bc}{b\kappa - a\gamma - ac}$,
   Case IIC $\iff \frac{\gamma}{\kappa} = \frac{bc}{b\kappa - a\gamma - ac}$.


5. Let us take

(4.13.69)  $a = 1, \quad b = 2, \quad \kappa = 1, \quad \gamma = 1.$

Note that (4.13.65) holds. Show that
   Case I $\iff c > 1$,
   Case II $\iff c < 1$,
   Case III $\iff c = 1$.

In Case II, show that
   Case IIA $\iff c > \frac{1}{3}$,
   Case IIB $\iff c < \frac{1}{3}$,
   Case IIC $\iff c = \frac{1}{3}$.

Figure 4.13.6 was produced using the parameters in (4.13.69), together with $c = 1/4$, $r = 1$.
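
The case distinctions in Exercises 2, 4, and 5 can be checked mechanically. The following is a minimal sketch, assuming $\zeta(y) = \kappa y/(1 + \gamma y)$, that Case I, II, III correspond to $c y_0 > 1$, $< 1$, $= 1$, and that Cases IIA, IIB, IIC correspond to the trace of $DV(x_0, y_0)$ being negative, positive, zero. The trace formula $r[Z(1 - cy_0) - cy_0]$ is a direct computation made for this sketch, and the name `classify` is a hypothetical helper, not notation from the text. Exact rational arithmetic is used so that the borderline cases are detected exactly.

```python
from fractions import Fraction


def classify(a, b, kappa, gamma, c):
    """Return the case label ("I", "IIA", "IIB", "IIC", "III").

    Assumes zeta(y) = kappa*y / (1 + gamma*y) and r > 0.
    """
    a, b, kappa, gamma, c = map(Fraction, (a, b, kappa, gamma, c))
    if b * kappa <= a * gamma:
        raise ValueError("(4.13.65) fails: need kappa/gamma > a/b")
    y0 = a / (b * kappa - a * gamma)   # solves b*zeta(y0) = a
    if c * y0 > 1:
        return "I"
    if c * y0 == 1:
        return "III"
    # Case II: the sign of Tr DV(x0, y0) equals the sign of s below.
    Z = a * gamma / (b * kappa)
    s = Z * (1 - c * y0) - c * y0
    if s < 0:
        return "IIA"   # sink
    if s > 0:
        return "IIB"   # source
    return "IIC"


if __name__ == "__main__":
    for c in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2), 1, 2):
        print(c, classify(1, 2, 1, 1, c))
```

With the parameters (4.13.69) and $c = 1/4$, the sketch reports Case IIB, in agreement with the caption of Figure 4.13.6.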
Exercises 6-10 deal with the system (4.13.63), where $\zeta(y)$ is given by (4.13.10), i.e.,

(4.13.70)  $\zeta(y) = \beta(1 - e^{-\gamma y}), \quad \beta\gamma = \kappa.$

Again there is a critical point $(x_0, y_0)$, given by (4.13.43), if and only if (4.13.65)
holds. We assume this holds, so $b\beta > a$.


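6. Show that the critical point $(x_0, y_0)$ is given by

(4.13.71)  $y_0 = \frac{1}{\gamma} \log \frac{b\beta}{b\beta - a}, \quad x_0 = \frac{b}{a}\, r y_0 (1 - c y_0).$

7. Show that

(4.13.72)  $Z = 1 - \frac{b\beta - a}{a} \log \frac{b\beta}{b\beta - a}.$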


8. Parallel to Exercise 2, study when Cases I-III hold. 
9. Parallel to Exercise 4, study when Cases IIA—IIC hold. 


10. Take $a$, $b$, $\kappa$, and $\gamma$ as in (4.13.69). Work out a parallel to Exercise 5.

For Exercises 11-12, consider the following system, for x predators and y prey,
presented in [44], p. 376:


it Saal bs = ; 
(4.13.73) ( 3) 
           $y' = ry(1 - cy) - x\zeta(y).$


Here the equation for y is as in (4.13.63), modeling the population of prey in terms 
of the logistic equation, modified by how fast the prey is eaten. The equation for x 
has a different basis, a sort of logistic equation in which the population y determines 
the population limit of x, at any given time. 


11. Work out an analysis of the system (4.13.73) as parallel as possible to the 
analysis done in this section for (4.13.63). 


12. Take ¢(y) as in (4.13.64) and work out results parallel to those of Exercises 
1-5. 


Exercises 13-15 are for readers who can use numerical software, with graphics 
capabilities. 


13. The following system is known as the basic model of virus dynamics (cf. [34],
p. 100, [52], p. 26):

(4.13.74)  $\frac{dx}{dt} = \lambda - dx - \beta x v,$
           $\frac{dy}{dt} = \beta x v - a y,$
           $\frac{dv}{dt} = k y - u v.$

Here, $x$ represents the uninfected cell population, $y$ the infected cell population,
and $v$ the virus population. The positive parameters $\lambda, d, \beta, a, k$, and $u$ are taken
to be constant. The ratio

(4.13.75)  $R_0 = \frac{\lambda \beta k}{a d u}$

is called the basic reproductive ratio. Graph solution curves for (4.13.74), with
various choices of parameters. Account for the assertion that if $R_0 < 1$ the virus
cannot maintain an infection, but if $R_0 > 1$ the system converges to an equilibrium,
in which $v > 0$.
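
As a starting point for Exercise 13, here is a minimal sketch that integrates (4.13.74) with a classical fourth order Runge-Kutta step and reports the final state. The function names, the initial state $(1, 0, 0.1)$, the step size, and the sample parameter values are all illustrative assumptions, not taken from the text; plotting is left to the reader's preferred graphics package.

```python
def basic_reproductive_ratio(lam, d, beta, a, k, u):
    """R_0 as in (4.13.75)."""
    return lam * beta * k / (a * d * u)


def simulate_virus(lam, d, beta, a, k, u,
                   state=(1.0, 0.0, 0.1), t_end=40.0, h=0.01):
    """Integrate (4.13.74) from t = 0 to t_end; return the final (x, y, v)."""

    def rhs(s):
        x, y, v = s
        return (lam - d * x - beta * x * v,
                beta * x * v - a * y,
                k * y - u * v)

    def shifted(s, slope, factor):
        return tuple(si + factor * h * ki for si, ki in zip(s, slope))

    s = tuple(state)
    for _ in range(int(round(t_end / h))):
        k1 = rhs(s)
        k2 = rhs(shifted(s, k1, 0.5))
        k3 = rhs(shifted(s, k2, 0.5))
        k4 = rhs(shifted(s, k3, 1.0))
        s = tuple(si + h * (p + 2 * q + 2 * w + e) / 6
                  for si, p, q, w, e in zip(s, k1, k2, k3, k4))
    return s


if __name__ == "__main__":
    for k in (1, 6):  # R_0 = 0.5 and R_0 = 3 with the other parameters below
        params = (1, 1, 1, 1, k, 2)
        print(basic_reproductive_ratio(*params), simulate_virus(*params))
```

For these sample parameters, setting the right side of (4.13.74) to zero gives the infected equilibrium $x = \lambda/(d R_0)$, $v = d(R_0 - 1)/\beta$, $y = uv/k$, which can be compared with the computed final state when $R_0 > 1$.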


14. The simplifying assumption that the virus population is proportional to the
infected cell population (say $\beta v = by$) leads to the system

(4.13.76)  $\frac{dx}{dt} = \lambda - dx - bxy,$
           $\frac{dy}{dt} = -ay + bxy.$

Study this system, with an eye to comparison with the Volterra-Lotka system
(4.13.11). Here, replace (4.13.75) by

(4.13.77)  $R_0 = \frac{b\lambda}{ad}.$


15. The following system modifies (4.13.76) by introducing $z(t)$, the population of
killer T cells, which kill off infected cells, thereby negatively affecting $y$:

(4.13.78)  $\frac{dx}{dt} = \lambda - dx - bxy,$
           $\frac{dy}{dt} = bxy - ay - pyz,$
           $\frac{dz}{dt} = cyz - bz,$

now with positive parameters $\lambda, d, b, a, p$, and $c$. Continue to define $R_0$ by (4.13.77).
Consider particularly cases where

(4.13.79)  $R_0 > 1, \quad c\Bigl(\frac{\lambda}{a} - \frac{d}{b}\Bigr) > b.$

Account for the assertion that in this case the virus population first grows, stimulating
the production of killer T cells, which in turn fight the infection and lead to
an equilibrium.


For more on these models, see [34] and [52], and references therein. 


4.14. Competing species equations 


The following system models the populations $x(t)$ and $y(t)$ of two competing species:

(4.14.1)  $\frac{dx}{dt} = ax(1 - bx) - cxy,$
          $\frac{dy}{dt} = \alpha y(1 - \beta y) - \gamma xy.$

In this model, each population is governed by a logistic equation in the absence of
the other species. The presence of the other species reduces the population of its
opponent, at a rate proportional to $xy$. Setting $X = bx$ and $Y = \beta y$ produces an
equation like (4.14.1), but with $X(1 - X)$ and $Y(1 - Y)$ in place of $x(1 - bx)$ and
$y(1 - \beta y)$, and with different factors. A change of notation gives the system

(4.14.2)  $\frac{dx}{dt} = ax(1 - x) - cxy,$
          $\frac{dy}{dt} = \alpha y(1 - y) - \gamma xy,$

which we will consider henceforth. We call this system CSE. We take
$a, c, \alpha, \gamma \in (0, \infty)$. Associated to this system is the vector field


(4.14.3)  $V(x, y) = \begin{pmatrix} ax(1 - x) - cxy \\ \alpha y(1 - y) - \gamma xy \end{pmatrix}.$

Note that $V(x, 0) = (ax(1 - x), 0)^t$ and $V(0, y) = (0, \alpha y(1 - y))^t$, so the $x$-axis
and $y$-axis are invariant under the flow $\Phi^t$ generated by $V$. Hence the quadrant
$\{x \ge 0, y \ge 0\}$, which is the region of biological significance, is invariant under $\Phi^t$.
Note also that

(4.14.4)  $V(1, y) = \begin{pmatrix} -cy \\ \alpha y(1 - y) - \gamma y \end{pmatrix}, \quad V(x, 1) = \begin{pmatrix} ax(1 - x) - cx \\ -\gamma x \end{pmatrix},$

so $\Phi^t$ leaves invariant the region

(4.14.5)  $B = \{(x, y) : 0 \le x, y \le 1\},$

for $t \ge 0$.

The vector field $V$ has the following critical points,

(4.14.6)  $(0, 0), \quad (0, 1), \quad (1, 0),$

and a fourth critical point $(x_0, y_0)$, satisfying

(4.14.7)  $c y_0 = a(1 - x_0), \quad \gamma x_0 = \alpha(1 - y_0).$

A calculation gives

(4.14.8)  $x_0 = \alpha\, \frac{a - c}{a\alpha - c\gamma}, \quad y_0 = a\, \frac{\alpha - \gamma}{a\alpha - c\gamma}.$

The point $(x_0, y_0)$ may or may not lie in the first quadrant. We investigate this
further below.


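We have

(4.14.9)  $DV(0, 0) = \begin{pmatrix} a & 0 \\ 0 & \alpha \end{pmatrix},$

so $(0, 0)$ is a source. Also,

(4.14.10)  $DV(0, 1) = \begin{pmatrix} a - c & 0 \\ -\gamma & -\alpha \end{pmatrix}, \quad DV(1, 0) = \begin{pmatrix} -a & -c \\ 0 & \alpha - \gamma \end{pmatrix},$

and each of these might be a saddle or a sink, depending on the signs of $a - c$ and
$\alpha - \gamma$. Next,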


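(4.14.11)  $DV(x_0, y_0) = \begin{pmatrix} a(1 - 2x_0) - c y_0 & -c x_0 \\ -\gamma y_0 & \alpha(1 - 2y_0) - \gamma x_0 \end{pmatrix} = \begin{pmatrix} -a x_0 & -c x_0 \\ -\gamma y_0 & -\alpha y_0 \end{pmatrix},$

the second identity by (4.14.7). Hence

(4.14.12)  $\det DV(x_0, y_0) = (a\alpha - c\gamma)\, x_0 y_0,$
           $\operatorname{Tr} DV(x_0, y_0) = -a x_0 - \alpha y_0.$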


At this point, it is natural to consider the following cases of CSE:

Case I.    $a > c$ and $\alpha > \gamma$.
Case II.   $a < c$ and $\alpha < \gamma$.
Case III.  $a > c$ and $\alpha < \gamma$.
Case IV.   $a < c$ and $\alpha > \gamma$.

Figure 4.14.1. Case I of CSE


In Case I, we see from (4.14.10) that

(4.14.13)  $(0, 1)$ and $(1, 0)$ are saddles.

In this case, $a\alpha > c\gamma$, so, by (4.14.8),

(4.14.14)  $x_0 > 0, \quad y_0 > 0,$

and the critical point $(x_0, y_0)$ is in the first quadrant. Then we see from (4.14.12)
that

(4.14.15)  $\det DV(x_0, y_0) > 0, \quad \operatorname{Tr} DV(x_0, y_0) < 0,$

so

(4.14.16)  $(x_0, y_0)$ is a sink.

We have

(4.14.17)  $\Phi^t(x, y) \to (x_0, y_0)$ as $t \to +\infty$,

whenever $x > 0$ and $y > 0$. The two competing species tend to an equilibrium of
coexistence. The phase portrait for this case, with $a = 2$, $\alpha = 2$, $c = 1$, $\gamma = 1$, is
illustrated in Figure 4.14.1.


In Case II, we see from (4.14.10) that

(4.14.18)  $(0, 1)$ and $(1, 0)$ are sinks.

In this case, $a\alpha < c\gamma$, so, by (4.14.8), again (4.14.14) holds, and the critical point
$(x_0, y_0)$ is in the first quadrant. We see from (4.14.12) that

(4.14.19)  $\det DV(x_0, y_0) < 0,$

so

(4.14.20)  $(x_0, y_0)$ is a saddle.

Figure 4.14.2. Case II of CSE

The phase portrait for this case, with $a = 1$, $\alpha = 1$, $c = 2$, $\gamma = 2$, is illustrated
in Figure 4.14.2. For almost all initial data $(x, y)$ in the first quadrant, $\Phi^t(x, y)$
tends to either $(0, 1)$ or $(1, 0)$ as $t \to +\infty$. One species or the other tends toward
extinction, depending on the initial conditions.


In Case III, we see from (4.14.10) that

(4.14.21)  $(0, 1)$ is a saddle and $(1, 0)$ is a sink.

From here two subcases arise, depending on the relative size of $a\alpha$ and $c\gamma$.

Case IIIA. $a\alpha > c\gamma$.
This time, by (4.14.8),

(4.14.22)  $x_0 > 0, \quad y_0 < 0,$

so the critical point $(x_0, y_0)$ is not in the first quadrant. We see from (4.14.12) that

(4.14.23)  $\det DV(x_0, y_0) < 0,$

so

(4.14.24)  $(x_0, y_0)$ is a saddle.

The phase portrait for this case, with $a = 2$, $\alpha = 1$, $c = 1/4$, $\gamma = 2$, is illustrated in
Figure 4.14.3. We have

(4.14.25)  $\Phi^t(x, y) \to (1, 0)$ as $t \to +\infty$,

whenever $x > 0$ and $y > 0$. Species $y$ tends to extinction.


Case IIIB. $a\alpha < c\gamma$.
This time, by (4.14.8),

(4.14.26)  $x_0 < 0, \quad y_0 > 0,$

and again the critical point $(x_0, y_0)$ is not in the first quadrant. We see from
(4.14.12) that

(4.14.27)  $\det DV(x_0, y_0) > 0.$

Thus

(4.14.28)  $(x_0, y_0)$ is a source or a sink,

depending on the sign of $\operatorname{Tr} DV(x_0, y_0)$. The phase portrait for this case, with
$a = 2$, $\alpha = 1/2$, $c = 1$, $\gamma = 2$, is illustrated in Figure 4.14.4. (In this example,
$(x_0, y_0)$ is a sink.) Again (4.14.25) holds whenever $x > 0$ and $y > 0$.

Figure 4.14.3. Case IIIA of CSE


To summarize Case III, the flows in the first quadrant have the same qualitative 
features in the two subcases; (4.14.25) holds. The features differ outside the first 
quadrant. 


As for Case IV, this reduces to Case III by switching the roles of x and y. 
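
The case analysis above is easy to automate. The following is a minimal sketch that evaluates (4.14.8) and (4.14.12) and reports the case and the linearized type of $(x_0, y_0)$; the function names are hypothetical helpers introduced only for this illustration.

```python
def interior_point(a, c, alpha, gamma):
    """Fourth critical point (x0, y0) of CSE, from (4.14.8)."""
    D = a * alpha - c * gamma
    if D == 0:
        raise ValueError("a*alpha = c*gamma: formula (4.14.8) does not apply")
    return alpha * (a - c) / D, a * (alpha - gamma) / D


def linear_type(a, c, alpha, gamma):
    """Type of (x0, y0) from the determinant and trace in (4.14.12)."""
    x0, y0 = interior_point(a, c, alpha, gamma)
    det = (a * alpha - c * gamma) * x0 * y0
    tr = -a * x0 - alpha * y0
    if det < 0:
        return "saddle"
    if det > 0 and tr < 0:
        return "sink"
    if det > 0 and tr > 0:
        return "source"
    return "degenerate"  # det = 0 or tr = 0: not decided by this test


def cse_case(a, c, alpha, gamma):
    """Case label I-IV, as listed after (4.14.12)."""
    if a > c and alpha > gamma:
        return "I"
    if a < c and alpha < gamma:
        return "II"
    if a > c and alpha < gamma:
        return "III"
    if a < c and alpha > gamma:
        return "IV"
    return "borderline"


if __name__ == "__main__":
    # Parameter sets used for Figures 4.14.1-4.14.4: (a, c, alpha, gamma)
    for p in [(2, 1, 2, 1), (1, 2, 1, 2), (2, 0.25, 1, 2), (2, 1, 0.5, 2)]:
        print(p, cse_case(*p), interior_point(*p), linear_type(*p))
```

Running it on the four parameter sets quoted for the figures lets one confirm the signs of $x_0$, $y_0$ asserted in each case.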


Exercises 


1. Note that if $x$ and $y$ solve (4.14.2), then

   $\frac{d}{dt}(x + y) = -ax^2 - \alpha y^2 - (c + \gamma)xy + ax + \alpha y.$

Show that there exists $R \in (0, \infty)$ such that

   $x, y \ge 0, \ x + y \ge R \Longrightarrow \frac{d}{dt}(x + y) < 0.$

Deduce global existence of solutions to (4.14.2), for $t \ge 0$, given $(x(0), y(0))$ in the
first quadrant.

Figure 4.14.4. Case IIIB of CSE

2. In the setting of Exercise 1, show that whenever $x(0) > 0$ and $y(0) > 0$, we have
$(x(t), y(t)) \in B$, given by (4.14.5), for $t > 0$ sufficiently large.


3. Consider the system 


with $\gamma \in (0, \infty)$. Specify when Cases I-IV hold. Record the possible outcomes, as
regards coexistence/extinction. 


4. Consider the system 


dt 


with $c \in (0, \infty)$. Specify when Cases I-IV hold. Record the possible outcomes, as
regards coexistence/extinction. 


5. Consider the system 
with $a \in (0, \infty)$. Specify when Cases I-IV hold. Record the possible outcomes, as
regards coexistence/extinction. 


4.15. Chaos in multidimensional systems 


As previewed in the introduction to this chapter, two phenomena conspire to limit 
the complexity of flows generated by autonomous planar vector fields. One is that 
orbits cannot cross each other, due to uniqueness (this holds in any number of 
dimensions). The other is that a directed curve (with nonzero velocity) in the 
plane divides a neighborhood of each of its points into two parts, the left and the 
right. This latter fact played an important role in §4.12. In dimension 3 and higher,
this breaks down completely, and allows for far more complex flows. 


Newtonian motion in a force field in the plane is described by a second order
$2 \times 2$ system of differential equations, which is converted to a $4 \times 4$ first order system.
Energy conservation confines the motion to a 3-dimensional constant energy surface. 
If the force is a central force, there is also conservation of angular momentum. 
These two conservation laws make for regular motion, as seen in §§4.5-4.6. These 
are known as integrable systems. Such integrability is special. Most systems from 
physics and other sources do not possess it. For example, the double pendulum 
equation, derived in §4.9, does not have this property. (We do not prove this here.) 


Flows generated by vector fields on $n$-dimensional domains with $n \ge 3$ are thus
sometimes regular, but often they lack regularity to such a degree that they are 
deemed chaotic. Signatures of chaos include the inability to predict the long time 
behavior of orbits. The lack of a formula for the solution in terms of elementary 
functions is one source of this inability, but there are deeper reasons. In particular, 
numerical approximations to the orbits of these flows reveal a sensitive dependence 
on initial conditions and other parameters. Furthermore, phase portraits of these 
orbits look complex. 


Research into these chaotic flows takes the study of differential equations to the 
next level, beyond this introduction. We devote this section to a discussion of two 
special cases of 3 x 3 systems, to give a flavor of the complexities that lie beyond, 
and we provide pointers to literature that addresses the deep questions raised by 
efforts to understand such systems. 


Lorenz equations 


The first example is the following system, produced by E. Lorenz in 1963 to
model some aspects of fluid turbulence:

(4.15.1)  $x' = \sigma(y - x),$
          $y' = rx - y - xz,$
          $z' = xy - bz.$

An alternative presentation is

(4.15.2)  $\begin{pmatrix} x \\ y \\ z \end{pmatrix}' = \begin{pmatrix} -\sigma & \sigma & 0 \\ r & -1 & 0 \\ 0 & 0 & -b \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -x \\ 0 & x & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$

Figure 4.15.1. Lorenz system, r = 10, 14, 18, 28


Denoting the right side of (4.15.2) by $V(x, y, z)$, we see that the first matrix on the
right side is $DV(0, 0, 0)$. One assumes the parameters $\sigma$, $b$, and $r$ are all positive.
Lorenz took

(4.15.3)  $\sigma = 10, \quad b = \frac{8}{3},$

and considered various values of $r$, with emphasis on

(4.15.4)  $r = 28.$

Phase portraits of some orbits for (4.15.1), with $\sigma$ and $b$ given by (4.15.3) and with
various values of $r$, are given in Figure 4.15.1. Each of the four portraits depicts
the forward orbit through the point

(4.15.5)  $x = 0.01, \quad y = 0, \quad z = 5.$

The portraits start out simple and execute a sequence of changes as $r$ increases,
reaching substantial apparent complexity at $r = 28$. We discuss some aspects of this.


First, some global results. Global forward solvability of (4.15.1) can be established
with the help of the remarkable function

(4.15.6)  $f(x, y, z) = rx^2 + \sigma y^2 + \sigma(z - 2r)^2.$

A calculation shows that if $(x(t), y(t), z(t))$ solves (4.15.1), then

(4.15.7)  $\frac{d}{dt} f(x, y, z) = -2\sigma(rx^2 + y^2 + bz^2 - 2brz).$
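
For readers who want to see the calculation behind (4.15.7) written out, the following worked derivation applies the chain rule to (4.15.6) and substitutes the right side of (4.15.1); the grouping of terms in the third line is only one convenient way to organize the cancellations.

```latex
\begin{aligned}
\frac{d}{dt} f(x,y,z)
 &= 2r\,x\,x' + 2\sigma\,y\,y' + 2\sigma (z-2r)\,z' \\
 &= 2r\sigma\,x(y-x) + 2\sigma\,y(rx - y - xz) + 2\sigma (z-2r)(xy - bz) \\
 &= -2r\sigma x^2 - 2\sigma y^2 - 2\sigma b z^2 + 4\sigma b r z
    + (2r\sigma + 2r\sigma - 4r\sigma)\,xy + (-2\sigma + 2\sigma)\,xyz \\
 &= -2\sigma\,\bigl(r x^2 + y^2 + b z^2 - 2brz\bigr).
\end{aligned}
```

The cross terms in $xy$ and $xyz$ cancel exactly, which is what makes this particular choice of $f$ effective.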


Clearly, there exists $K \in (0, \infty)$ such that

(4.15.8)  $B = \{(x, y, z) \in \mathbb{R}^3 : f(x, y, z) \le K\}$

is a closed, bounded subset of $\mathbb{R}^3$ and the right side of (4.15.7) is $< 0$ on the
complement of $B$. Hence

(4.15.9)  $\Phi^t(B) \subset B, \quad \forall\, t \ge 0,$

where $\Phi^t$ is the flow generated by $V(x, y, z)$. Moreover, for each $(x, y, z) \in \mathbb{R}^3$,

(4.15.10)  $\Phi^t(x, y, z) \in B$, for all sufficiently large $t > 0$.

Note that (4.15.9) plus the identity $\Phi^t = \Phi^s \circ \Phi^{t-s}$ implies

(4.15.11)  $\Phi^t(B) \subset \Phi^s(B)$ for $0 < s < t$,

so $B(t) = \Phi^t(B)$ is a family of closed, bounded sets that is decreasing as $t \nearrow +\infty$.
Now set

(4.15.12)  $\mathcal{B} = \bigcap_{t \in \mathbb{R}^+} B(t) = \bigcap_{k \in \mathbb{Z}^+} B(k).$

The set $\mathcal{B}$ is called the attractor for (4.15.1). We have

(4.15.13)  $\Phi^t(\mathcal{B}) = \mathcal{B}, \quad \forall\, t \ge 0.$

Note that

(4.15.14)  $\operatorname{div} V = -\sigma - 1 - b < 0,$

so results of §4.3 imply

(4.15.15)  $\operatorname{Vol} \mathcal{B} = 0.$

This attractor has a simple description for small $r$, but becomes very complex for
larger $r$.


To proceed with the analysis, consider the critical points. The origin is a critical
point of $V$ for all $\sigma, b, r \in (0, \infty)$. Since $DV(0)$ is the first matrix on the right side
of (4.15.2), we see its eigenvalues are

(4.15.16)  $\lambda_\pm = -\frac{\sigma + 1}{2} \pm \frac{1}{2}\sqrt{(\sigma + 1)^2 + 4\sigma(r - 1)}, \quad \lambda_3 = -b,$

with eigenvectors

(4.15.17)  $v_\pm = \begin{pmatrix} \sigma \\ \lambda_\pm + \sigma \\ 0 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$

It follows from (4.15.16) that

(4.15.18)  $0 < r < 1 \Longrightarrow DV(0)$ has 3 negative eigenvalues,
           $r > 1 \Longrightarrow DV(0)$ has 2 negative and one positive eigenvalue.

For $r > 1$, the positive eigenvalue is $\lambda_+$ and its associated eigenvector is $v_+$. There
is a parallel to the results in (4.3.38) describing saddles. It is shown in [19] that
there is a smooth 2-dimensional surface through the origin consisting of points $p$
such that $\Phi^t(p) \to 0$ as $t \to +\infty$ and a smooth 1-dimensional curve through the
origin consisting of points $p$ such that $\Phi^t(p) \to 0$ as $t \to -\infty$. In general, a smooth
$k$-dimensional surface in $\mathbb{R}^n$ is called a $k$-dimensional manifold. The sets described
above are called a stable manifold and an unstable manifold, respectively. See also
Appendix 4.C for further discussion.
For $r > 1$, $V$ has two additional critical points, satisfying

(4.15.19)  $x = y, \quad (r - 1 - z)x = 0, \quad bz = x^2,$

i.e.,

(4.15.20)  $C_\pm = \bigl(\pm\sqrt{b(r - 1)}, \ \pm\sqrt{b(r - 1)}, \ r - 1\bigr).$

We have

(4.15.21)  $DV(C_\pm) = \begin{pmatrix} -\sigma & \sigma & 0 \\ 1 & -1 & \mp\xi \\ \pm\xi & \pm\xi & -b \end{pmatrix}, \quad \xi = \sqrt{b(r - 1)}.$

Note that $DV(C_+)$ and $DV(C_-)$ are conjugate by the action of

(4.15.22)  $\begin{pmatrix} -1 & & \\ & -1 & \\ & & 1 \end{pmatrix},$

so they have the same eigenvalues. This mirrors the fact that (4.15.1) is invariant
under the transformation $(x, y, z) \mapsto (-x, -y, z)$. Further calculations give the
following results, when $\sigma$ and $b$ are given by (4.15.3):

(4.15.23)  $DV(C_\pm)$ has
           3 negative eigenvalues for $1 < r < 1.346\cdots$,
           1 negative and 2 with negative real part for $1.346\cdots < r < 24.74\cdots$,
           1 negative and 2 with positive real part for $r > 24.74\cdots$.

In the first two cases in (4.15.23), Proposition 4.3.4 applies, and for all points $p$
sufficiently close to $C_+$, $\Phi^t(p) \to C_+$ as $t \to +\infty$, and similarly for $C_-$. The third
case in (4.15.23) is like the second case in (4.15.18), except the numbers are reversed.
In such a case, there are a 2-dimensional unstable manifold and a 1-dimensional
stable manifold through $C_+$, and similarly for $C_-$, in the language introduced below
(4.15.18).
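
The threshold $24.74\cdots$ in (4.15.23) can be recovered without computing eigenvalues. The sketch below assumes the characteristic polynomial of $DV(C_\pm)$ in (4.15.21) is $\lambda^3 + (\sigma + b + 1)\lambda^2 + b(\sigma + r)\lambda + 2\sigma b(r - 1)$ (a direct expansion made for this illustration, not quoted from the text) and applies the Routh-Hurwitz criterion for cubics: all roots have negative real part if and only if $a_1 > 0$, $a_3 > 0$, and $a_1 a_2 > a_3$. The function names are hypothetical.

```python
def char_poly_coeffs(sigma, b, r):
    """Coefficients (a1, a2, a3) of lambda^3 + a1 lambda^2 + a2 lambda + a3."""
    return (sigma + b + 1.0, b * (sigma + r), 2.0 * sigma * b * (r - 1.0))


def all_roots_in_left_half_plane(sigma, b, r):
    """Routh-Hurwitz test for a monic cubic."""
    a1, a2, a3 = char_poly_coeffs(sigma, b, r)
    return a1 > 0 and a3 > 0 and a1 * a2 > a3


def stability_threshold(sigma, b):
    """Value of r at which a1*a2 = a3; requires sigma > b + 1."""
    if sigma <= b + 1:
        raise ValueError("no threshold: need sigma > b + 1")
    return sigma * (sigma + b + 3.0) / (sigma - b - 1.0)


if __name__ == "__main__":
    sigma, b = 10.0, 8.0 / 3.0
    print(stability_threshold(sigma, b))   # 470/19 for these parameters
    for r in (14.0, 24.7, 24.8, 28.0):
        print(r, all_roots_in_left_half_plane(sigma, b, r))
```

Solving $a_1 a_2 = a_3$ for $r$ gives $r = \sigma(\sigma + b + 3)/(\sigma - b - 1)$, which equals $470/19$ for the parameters (4.15.3).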


With these calculations in hand, let's take a closer look at the four phase
portraits depicted in Figure 4.15.1, orbits with initial data given by (4.15.5). We
see from (4.15.1) that the $z$-axis is invariant under the flow for all values of the
parameters. Furthermore, on the $z$-axis, $z' = -bz$. Now the initial point $(0.01, 0, 5)$
is close by, but for all $r$-values depicted, $DV(0)$ has one positive eigenvalue, and the
orbits push away from the origin, in a direction close to $v_+$, where $v_+$ is given by
(4.15.17). The orbit from $(+0.01, 0, 5)$ spirals into the critical point $C_+$ in the first
two portraits, where $r = 10$ and $14$. Similarly, the orbit from $(-0.01, 0, 5)$ spirals
into $C_-$.
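
The behavior just described for $r = 10$ can be reproduced numerically. The following is a minimal sketch using a classical fourth order Runge-Kutta step; the step size, the final time, and the function names are illustrative assumptions, and this is not necessarily the scheme used to produce Figure 4.15.1.

```python
import math


def lorenz_rhs(s, sigma, b, r):
    """Right side of (4.15.1)."""
    x, y, z = s
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def rk4_step(s, h, sigma, b, r):
    """One classical Runge-Kutta step of size h."""
    def shifted(slope, factor):
        return tuple(si + factor * h * ki for si, ki in zip(s, slope))

    k1 = lorenz_rhs(s, sigma, b, r)
    k2 = lorenz_rhs(shifted(k1, 0.5), sigma, b, r)
    k3 = lorenz_rhs(shifted(k2, 0.5), sigma, b, r)
    k4 = lorenz_rhs(shifted(k3, 1.0), sigma, b, r)
    return tuple(si + h * (p + 2 * q + 2 * w + e) / 6
                 for si, p, q, w, e in zip(s, k1, k2, k3, k4))


def flow(s, t_end, r, sigma=10.0, b=8.0 / 3.0, h=0.005):
    """Approximate Phi^t(s) at t = t_end."""
    for _ in range(int(round(t_end / h))):
        s = rk4_step(s, h, sigma, b, r)
    return s


def critical_point(sign, r, b=8.0 / 3.0):
    """C_+ (sign = +1) or C_- (sign = -1), from (4.15.20)."""
    xi = math.sqrt(b * (r - 1.0))
    return (sign * xi, sign * xi, r - 1.0)


if __name__ == "__main__":
    end = flow((0.01, 0.0, 5.0), 50.0, r=10.0)
    print(end, critical_point(+1, 10.0))
```

Changing `r` to 18 or 28 in the call to `flow`, and recording intermediate states, gives data for portraits like the later ones in Figure 4.15.1.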


Around $r \approx 14$, something new happens. These orbits pass close to the origin.
At a certain critical value of $r$, approximately 14, the unstable manifold is actually a pair of
homoclinic orbits, approaching the origin both as $t \to -\infty$ and as $t \to +\infty$. For
larger values of $r$, the orbit from $(+0.01, 0, 5)$ crosses over and spirals into $C_-$, as
depicted in the third portrait in Figure 4.15.1, for $r = 18$. Similarly, the orbit from
$(-0.01, 0, 5)$ spirals into $C_+$.

This spiraling into $C_\pm$ does not endure as $r$ increases. As stated in (4.15.23),
there is a critical value $r_c \approx 24.74$ past which $DV(C_\pm)$ has two eigenvalues with positive
rather than negative real part. In the fourth phase portrait of Figure 4.15.1, we
have $r = 28 > r_c$. The orbit starting from $(0.01, 0, 5)$ approaches the unstable
manifold of $C_-$ and then spirals out from this critical point. After some spiraling
out from $C_-$, this orbit makes a jump to the vicinity of $C_+$, approaches its unstable
manifold, and starts spiraling out from $C_+$. After a while, the orbit jumps back to
the vicinity of $C_-$, and this spiraling and jumping is endlessly repeated.

Figure 4.15.2. Lorenz system, r = 28, for two ranges of t (panels: $0 \le t \le 40$ and $40 \le t \le 80$)

The two phase portraits in Figure 4.15.2 show

(4.15.24)  $\Phi^t(0.08, 0, 5), \quad 40j \le t \le 40(j + 1), \quad j = 0, 1.$

The portraits differ in fine detail from each other, but they are fairly similar, and
seem to reveal what is called a strange attractor.

We make one further comment about Figures 4.15.1-4.15.2. Of course, the
orbits depicted are curves $(x(t), y(t), z(t))$ in $\mathbb{R}^3$. What is shown in these figures
are 2-dimensional projections, namely $(u(t), v(t))$, with

Periodically forced Duffing equation 


Our second example arises from motion in one dimension, in a nonlinear background
field, with a periodic forcing term added,

(4.15.25)  $\frac{d^2 x}{dt^2} = f(x) + r\cos t.$

Here $r$ is a parameter. When converted to a first order system and put in autonomous
form, this becomes

(4.15.26)  $\frac{dx}{dt} = y,$
           $\frac{dy}{dt} = f(x) + r\cos z,$
           $\frac{dz}{dt} = 1.$

Figure 4.15.3. Phase portrait for Duffing's equation

We take

(4.15.27)  $f(x) = x - x^3.$

The equation (4.15.25) is called a periodically forced Duffing equation if $r \ne 0$.

For $r = 0$, (4.15.25) is called Duffing's equation, and it reduces to a $2 \times 2$
system, whose phase portrait is given in Figure 4.15.3. We take orbits through the
points

(4.15.28)  $x = \sqrt{2} + \frac{3k}{10}, \quad k = -3, -2, 0, 1, \quad y = 0,$

and their mirror images about the $y$-axis. There are two homoclinic orbits, each
tending to the origin as $t \to \pm\infty$. All the other orbits are closed, and lie on level
curves of

(4.15.29)  $E(x, y) = \frac{y^2}{2} - \frac{x^2}{2} + \frac{x^4}{4}.$

For $r \ne 0$, matters are more complicated, since $z$ is coupled to $(x, y)$ in (4.15.26).
We need a different way to portray the orbits $(x(t), y(t), z(t))$. In this case, unlike
for the Lorenz system, a linear projection of $(x, y, z)$ space onto $(u, v)$ space is not
the best way to proceed. Taking into account the periodicity of the right side of
(4.15.26) in $z$, we treat $z = t$ as an angular variable, and transfer $(x, y, z)$ space to
$(\tilde{x}, \tilde{y}, \tilde{z})$ space, with

   $\tilde{x} = (x + 2)\cos t, \quad \tilde{y} = y, \quad \tilde{z} = (x + 2)\sin t.$


This corresponds to taking the (x,y) plane and rotating it about the vertical axis 
x = —2. We follow this with the linear map to the (u,v) plane, u = 2 — #/2, 
v = y— 2/2. Consequently, to produce Figures 4.15.4-4.15.6, we draw curves 
Figure 4.15.4. 3D orbits for unforced Duffing's equation, $x = \sqrt{2} + 3k/10$


(u(t), o(t)), with 


(4.15.30) u(t) = (w(t) +2)(sint-S*), v(t) = vl) - 4-9, 


For initial data, we take $x$ and $y$ as in (4.15.28) and $z = 0$. We use a fourth order
Runge-Kutta scheme.

Figure 4.15.4 draws such curves when $(x, y, z)$ solve (4.15.26) with $r = 0$. In all
but the third portrait, the orbits lie on smooth donut-shaped surfaces (called tori).
The third portrait depicts the homoclinic orbit, which spends most of its time near
the origin in $(x, y)$-space. It lies on a surface that is smooth except along a curve,
where it has a corner.


Figure 4.15.5 gives this representation of orbits of (4.15.26), with

(4.15.31)  $r = 0.1.$

One of the four orbits seems to lie on a smooth torus (somewhat deformed). The
other three are all apparently a mess, and also, apparently, about the same mess.
In Figure 4.15.6 we present an enlarged version of one such orbit, with initial data

(4.15.32)  $x = \sqrt{2}, \quad y = 0.$


An alternative to depicting orbits of the system (4.15.26)-(4.15.27) is to depict
orbits of the associated Poincaré map, characterized as follows. Take an initial
point $p = (x_0, y_0, 0)$. Solve (4.15.26) with this initial data, and then set
$q = (x(2\pi), y(2\pi), 2\pi)$. The nature of the mapping on the third coordinate is trivial in
this case, so we just consider

(4.15.33)  $(x(0), y(0)) \mapsto (x(2\pi), y(2\pi)).$

Figure 4.15.6. Orbit for forced Duffing's equation, $x = \sqrt{2}$

This is the Poincaré map associated to the system (4.15.26).
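
The map (4.15.33) is straightforward to approximate. The following is a minimal sketch that integrates (4.15.26)-(4.15.27) over one forcing period with a classical fourth order Runge-Kutta scheme; the number of steps and the function names are illustrative assumptions, not details given in the text.

```python
import math


def duffing_rhs(x, y, z, r):
    """Right side of (4.15.26) with f(x) = x - x^3 as in (4.15.27)."""
    return (y, x - x**3 + r * math.cos(z), 1.0)


def poincare_map(x, y, r=0.1, steps=2000):
    """Approximate (x(0), y(0)) -> (x(2 pi), y(2 pi)), starting at z = 0."""
    h = 2.0 * math.pi / steps
    z = 0.0
    for _ in range(steps):
        k1 = duffing_rhs(x, y, z, r)
        k2 = duffing_rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], z + 0.5 * h, r)
        k3 = duffing_rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], z + 0.5 * h, r)
        k4 = duffing_rhs(x + h * k3[0], y + h * k3[1], z + h, r)
        x, y = (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
                y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
        z += h
    return x, y


def energy(x, y):
    """E(x, y) from (4.15.29); conserved when r = 0."""
    return y * y / 2 - x * x / 2 + x**4 / 4


if __name__ == "__main__":
    # A short orbit of the Poincare map through (sqrt(2), 0) with r = 0.1.
    p = (math.sqrt(2.0), 0.0)
    for _ in range(5):
        p = poincare_map(*p, r=0.1)
        print(p)
```

Iterating `poincare_map` many times from a fixed starting point and plotting the resulting points is one way to produce pictures of orbits of the Poincaré map. With `r = 0.0` the energy (4.15.29) should be conserved up to discretization error, which gives a quick sanity check on the integrator.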

The Poincaré map is defined in a more general context. Let $X$ be a smooth
vector field on $\Omega \subset \mathbb{R}^n$ and let $S$ be an $(n-1)$-dimensional surface transversal to $X$,
i.e., $X$ is nowhere tangent to $S$. Under certain circumstances, one has a Poincaré
map

(4.15.34)  $P : \mathcal{O} \longrightarrow S,$

defined on an open subset $\mathcal{O} \subset S$, where $p \in \mathcal{O}$ and $P(p) = q$ is the point $\Phi_X^t(p)$
with smallest $t > 0$ such that $\Phi_X^t(p) \in S$. See Figure 4.15.7.

Figure 4.15.7. Poincaré map


In the setting of (4.15.26), (4.15.33), the orbits of the Poincaré map through the
points $(x(0), y(0)) = (x, 0)$, with

(4.15.35)  $x = -1.1, \ 1.3, \ \sqrt{2}, \ 1.85,$ and $2.1$,

are presented in Figure 4.15.8. These five initial data give rise to five orbits for the
Poincaré map. Of these, four seem to lie along smooth curves. The orbit through
$(x, y) = (\sqrt{2}, 0)$ populates the fuzzy grey area, formed from 20,000 points in the
orbit of the Poincaré map (or rather an approximation via a Runge-Kutta difference
scheme). Note that $(x, y) = (\sqrt{2}, 0)$ is the initial datum leading to Figure 4.15.6.


The appearance of smooth closed curves $\gamma_j$, invariant under the Poincaré map,
suggests the following.

Assertion. Each $\gamma_j$ bounds a region $\Omega_j \subset \mathbb{R}^2$, smoothly equivalent to the disk

(4.15.36)  $D = \{x \in \mathbb{R}^2 : \|x\| \le 1\},$

that is to say, there are smooth one-to-one maps $\varphi_j : \Omega_j \to D$ with smooth inverses
$\varphi_j^{-1} : D \to \Omega_j$, and the Poincaré map takes $\Omega_j$ into itself, i.e.,

(4.15.37)  $P : \Omega_j \longrightarrow \Omega_j.$

Granted this, we can make use of the following result, known as Brouwer's
fixed-point theorem.

Theorem. Each smooth map

(4.15.38)  $\psi : D \longrightarrow D$

has a fixed point, i.e., there exists $p \in D$ such that $\psi(p) = p$.

Figure 4.15.8. Poincaré map for forced Duffing, $x = -1.1, 1.3, \sqrt{2}, 1.85, 2.1$

See Appendix 4.G for a proof of this result. Given the assertion above, we can
take $\psi = \varphi_j \circ P \circ \varphi_j^{-1}$ and conclude that

(4.15.39)  $P(q) = q, \quad q = \varphi_j^{-1}(p).$


Such fixed points of the Poincaré map give rise to periodic solutions to the associated 
systems of differential equations (in this case, (4.15.25)). Establishing the existence 
of periodic solutions is one of many uses for Poincaré maps. We refer to references 
cited in the next paragraph for discussions of other uses. 


Understanding how the chaotic looking orbits for the Lorenz and Duffing sys- 
tems and other systems are chaotic has engendered a lot of work. For more material 
on this, we particularly recommend the Introduction to Chaos in Chapter 2 of [16], 
which treats four examples, including the Lorenz system and the forced Duffing 
system. Other material on chaotic systems can be found in [2], [6], [18], [23], [25], 
and [30]. A detailed study of the Lorenz system is given in [41].

Exercises


1. Consider the double pendulum system, in the limit $m_2 = 0$, given by (4.9.35).
Substitute 


(4.15.40) A(t) ¥ reoswt, w= 4 
at 

into the second equation in (4.9.35), expand in powers of r, and throw away terms 

containing second and higher powers of r. Show that you get 


£ 
(4.15.41) 65 (t) + é sin 09(t) = ru cos 02(t) coswt. 
2 2) 


Exercises 2-10 are for readers who can use numerical software with graphics capa- 
bilities. 


2. Write a program to exhibit solution curves of (4.15.41), in a fashion analogous to 
the treatment of (4.15.25), involving an analogue of (4.15.26). Try various values 
of $r$, $g/\ell_2$, etc., and see when the behavior is more chaotic or less chaotic.


3. In place of (4.15.41), consider the periodically forced pendulum equation 
(4.15.42) 6" (t) + isin O(t) = pcoswt, 


and write a program to exhibit solution curves of this equation, in the spirit of 
Exercise 2. 


4. Write a program to exhibit solutions to the full double pendulum system (4.9.15)— 
(4.9.16). Take, e.g., $m_1 = m_2 = 1$, $\ell_1 = \ell_2 = 1$, and variants.


5. Examine orbits and Poincaré maps for the periodically forced Duffing equation 
for other values of r, such as r = 0.2, 0.05, 1072, 107%, etc. Also consider other 
forcing periods, i.e., replace $\cos t$ by $\cos \omega t$.


Exercises 6-9 deal with systems of the form 
a (x 
(4.15.43) ie “) =-VV(za,y). 


These are $2 \times 2$ second order systems, which convert to $4 \times 4$ first order systems.
Energy conservation leads to flows on 3-dimensional constant energy surfaces. In 
each case, write a program to exhibit solution curves (x(t), y(t)). See whether the 
displayed solutions seem to be regular or chaotic. 


6. Take 
V(a,y) = 2? +aay+y*. 
Try various $a \in [0, 10]$.


7. Take 

Vi@g,y) =a*+acy+y*, ae [-2,2]. 
8. Take 

V(a,y) =a* +an7y+y*, ae [-1,1]. 
9. Take 


1 
V(z,y) = 5 ( +9") tala —a*y +y"), ae [0, 1). 


10. Taking off from models in §§4.13-4.14, see if you can construct models of 
interactions of three species that exhibit chaotic behavior. 


4.A. The derivative in several variables 


Here we present basic definitions and results on multivariable differential calculus,
useful for the material in Chapter 4. To start this section off, we define the derivative
and discuss some of its basic properties. Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and
$F : \mathcal{O} \to \mathbb{R}^m$ a continuous function. We say $F$ is differentiable at a point $x \in \mathcal{O}$,
with derivative $L$, if $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation such that, for $y \in \mathbb{R}^n$,
small,

(4.A.1)  $F(x + y) = F(x) + Ly + R(x, y)$

with $\|R(x, y)\| = o(\|y\|)$, i.e.,

(4.A.2)  $\frac{\|R(x, y)\|}{\|y\|} \to 0$ as $y \to 0$.

We denote the derivative at $x$ by $DF(x) = L$. With respect to the standard bases
of $\mathbb{R}^n$ and $\mathbb{R}^m$, $DF(x)$ is simply the matrix of partial derivatives,

(4.A.3)  $DF(x) = \Bigl(\frac{\partial F_j}{\partial x_k}\Bigr),$

so that, if $v = (v_1, \dots, v_n)^t$ (regarded as a column vector), then

(4.A.4)  $DF(x)v = \Bigl(\sum_k \frac{\partial F_1}{\partial x_k} v_k, \ \dots, \ \sum_k \frac{\partial F_m}{\partial x_k} v_k\Bigr)^t.$

It will be shown below that $F$ is differentiable whenever all the partial derivatives
exist and are continuous on $\mathcal{O}$. In such a case we say $F$ is a $C^1$ function on $\mathcal{O}$. More
generally, $F$ is said to be $C^k$ if all its partial derivatives of order $\le k$ exist and are
continuous. If $F$ is $C^k$ for all $k$, we say $F$ is $C^\infty$.

Sometimes one might want to differentiate an $\mathbb{R}^m$-valued function $F(x, t)$ only
with respect to $x$. In that case, if

   $F(x + y, t) = F(x, t) + Ly + R(x, y, t),$

with $\|R(x, y, t)\| = o(\|y\|)$, we write $D_x F(x, t) = L$.

We now derive the chain rule for the derivative. Let $F : \mathcal{O} \to \mathbb{R}^m$ be differentiable
at $x \in \mathcal{O}$, as above, let $U$ be a neighborhood of $z = F(x)$ in $\mathbb{R}^m$, and let
$G : U \to \mathbb{R}^k$ be differentiable at $z$. Consider $H = G \circ F$. We have

(4.A.5)  $H(x + y) = G(F(x + y))$
         $= G(F(x) + DF(x)y + R(x, y))$
         $= G(z) + DG(z)\bigl(DF(x)y + R(x, y)\bigr) + R_1(x, y)$
         $= G(z) + DG(z)DF(x)y + R_2(x, y),$

with

   $\frac{\|R_2(x, y)\|}{\|y\|} \to 0$ as $y \to 0$.

Thus $G \circ F$ is differentiable at $x$, and

(4.A.6)  $D(G \circ F)(x) = DG(F(x)) \cdot DF(x).$

In case $k = 1$, so $G : U \to \mathbb{R}$, we can rewrite (4.A.6) as

(4.A.7)  $D(G \circ F)(x) = \nabla G(F(x))^t\, DF(x),$

where $\nabla G(y)^t = (\partial G/\partial y_1, \dots, \partial G/\partial y_m)$. If in addition $n = 1$, so $F$ is a function
of one variable $x \in \mathcal{O} \subset \mathbb{R}$, with values in $\mathbb{R}^m$, this in turn leads to

(4.A.8)  $\frac{d}{dx} G(F(x)) = \nabla G(F(x)) \cdot F'(x).$

This leads to such formulas as (4.3.10).

Another useful remark is that, by the fundamental theorem of calculus, applied
to $\varphi(t) = F(x + ty)$,


(4.A.9)  $F(x + y) = F(x) + \int_0^1 DF(x + ty)\, y\, dt,$

provided $DF$ is continuous. A closely related application of the fundamental theorem
of calculus is that, if we assume $F : \mathcal{O} \to \mathbb{R}^m$ is differentiable in each variable
separately, and that each $\partial F/\partial x_j$ is continuous on $\mathcal{O}$, then

(4.A.10)  $F(x + y) = F(x) + \sum_{j=1}^n A_j(x, y)\, y_j,$
          $A_j(x, y) = \int_0^1 \frac{\partial F}{\partial x_j}\bigl(x + z_{j-1} + t y_j e_j\bigr)\, dt,$

where $z_0 = 0$, $z_j = (y_1, \dots, y_j, 0, \dots, 0)$, and $\{e_j\}$ is the standard basis of $\mathbb{R}^n$. Now
(4.A.10) implies $F$ is differentiable on $\mathcal{O}$, as we stated below (4.A.4). Thus we have
established the following.

Proposition 4.A.1. If $\mathcal{O}$ is an open subset of $\mathbb{R}^n$ and $F : \mathcal{O} \to \mathbb{R}^m$ is of class $C^1$,
then $F$ is differentiable at each point $x \in \mathcal{O}$.


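For the study of higher order derivatives of a function, the following result is
fundamental.

Proposition 4.A.2. Assume $F : \mathcal{O} \to \mathbb{R}^m$ is of class $C^2$, with $\mathcal{O}$ open in $\mathbb{R}^n$.
Then, for each $x \in \mathcal{O}$, $1 \le j, k \le n$,

(4.A.11)  $\frac{\partial}{\partial x_j} \frac{\partial F}{\partial x_k}(x) = \frac{\partial}{\partial x_k} \frac{\partial F}{\partial x_j}(x).$

To prove Proposition 4.A.2, it suffices to treat real valued functions, so consider
$f : \mathcal{O} \to \mathbb{R}$. For $1 \le j \le n$, we set

(4.A.12)  $\Delta_{j,h} f(x) = \frac{1}{h}\bigl(f(x + h e_j) - f(x)\bigr),$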

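where $\{e_1, \dots, e_n\}$ is the standard basis of $\mathbb{R}^n$. The mean value theorem (for functions
of $x_j$ alone) implies that if $\partial_j f = \partial f/\partial x_j$ exists on $\mathcal{O}$, then, for $x \in \mathcal{O}$, $h > 0$
sufficiently small,

(4.A.13)  $\Delta_{j,h} f(x) = \partial_j f(x + \alpha_j h e_j),$

for some $\alpha_j \in (0, 1)$, depending on $x$ and $h$. Iterating this, if $\partial_j(\partial_k f)$ exists on $\mathcal{O}$,
then, for $x \in \mathcal{O}$, $h > 0$ sufficiently small,

(4.A.14)  $\Delta_{k,h}\Delta_{j,h} f(x) = \partial_k(\Delta_{j,h} f)(x + \alpha_k h e_k)$
          $= \Delta_{j,h}(\partial_k f)(x + \alpha_k h e_k)$
          $= \partial_j \partial_k f(x + \alpha_k h e_k + \alpha_j h e_j),$

with $\alpha_j, \alpha_k \in (0, 1)$. Here we have used the elementary result

(4.A.15)  $\partial_k \Delta_{j,h} f = \Delta_{j,h}(\partial_k f).$

We deduce the following.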


Proposition 4.A.3. If O,f and 0j;0,f exist on O and 0j;0,f is continuous at 
tp € O, then 


(4.4.16) 0; Ox f (xo) = jim Ax nAj,nf (20): 
Clearly 
(4.4.17) Ak nAj,nf = AjnArnf, 


so we have the following, which easily implies Proposition 4.A.2. 


Corollary 4.A.4. In the setting of Proposition 4.A.3, if also $\partial_j f$ and $\partial_k\partial_j f$ exist
on $O$ and $\partial_k\partial_j f$ is continuous at $x_0$, then

(4.A.18) $\partial_j\partial_k f(x_0) = \partial_k\partial_j f(x_0).$
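
The role of the difference quotients in (4.A.12)-(4.A.17) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical smooth test function on $\mathbb{R}^2$ (the function, the base point, and the step $h$ are our choices for illustration and are not taken from the text). It forms the iterated quotients in both orders and compares them with the mixed partial derivative computed by hand.

```python
import math

def delta(f, j, h):
    """Return the difference quotient Delta_{j,h} f of (4.A.12) as a new function."""
    def quotient(x):
        shifted = list(x)
        shifted[j] += h
        return (f(shifted) - f(x)) / h
    return quotient

def f(x):
    # Hypothetical smooth test function on R^2 (illustration only).
    return math.sin(x[0] * x[1]) + x[0] ** 3 * x[1]

x0 = [0.7, -0.4]
h = 1e-3

# Iterated difference quotients in both orders, as in (4.A.17).
d_kj = delta(delta(f, 0, h), 1, h)(x0)
d_jk = delta(delta(f, 1, h), 0, h)(x0)

# Mixed partial of the test function, computed by hand:
# d/dx0 d/dx1 f = cos(x0 x1) - x0 x1 sin(x0 x1) + 3 x0^2.
u = x0[0] * x0[1]
exact = math.cos(u) - u * math.sin(u) + 3 * x0[0] ** 2

print(d_kj, d_jk, exact)
```

The two iterated quotients agree up to rounding, which is the content of (4.A.17), and both approach the mixed partial as $h \to 0$, in line with (4.A.16).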


If $U$ and $V$ are open subsets of $\mathbb{R}^n$ and $F : U \to V$ is a $C^1$ map, we say $F$ is
a diffeomorphism of $U$ onto $V$ provided $F$ maps $U$ one-to-one and onto $V$, and its
inverse $G = F^{-1}$ is a $C^1$ map. If $F$ is a diffeomorphism, it follows from the chain
rule that $DF(x)$ is invertible for each $x \in U$. We now state a partial converse of
this, the inverse function theorem, which is a fundamental result in multivariable 
calculus. 


Theorem 4.A.5. Let $F$ be a $C^k$ map from an open neighborhood $\Omega$ of $p_0 \in \mathbb{R}^n$ to
$\mathbb{R}^n$, with $q_0 = F(p_0)$. Assume $k \ge 1$. Suppose the derivative $DF(p_0)$ is invertible.
Then there is a neighborhood $U$ of $p_0$ and a neighborhood $V$ of $q_0$ such that $F :
U \to V$ is one-to-one and onto, and $F^{-1} : V \to U$ is a $C^k$ map. (So $F : U \to V$ is
a diffeomorphism.)

Proofs of Theorem 4.A.5 can be found in a number of texts, including [31],
Chapter 2 of [49], and Chapter 1 of [45].


4.B. Convergence, compactness, and continuity 


We discuss a number of notions and results related to convergence in $\mathbb{R}^n$, of use in
this chapter. First, a sequence of points $(p_j)$ in $\mathbb{R}^n$ converges to a limit $p \in \mathbb{R}^n$ (we
write $p_j \to p$) if and only if

(4.B.1) $\|p_j - p\| \to 0.$

Here $\|\cdot\|$ is the norm on $\mathbb{R}^n$ arising in §2.10 of Chapter 2, and the meaning of
(4.B.1) is that for every $\varepsilon > 0$ there exists $N$ such that

(4.B.2) $j \ge N \Longrightarrow \|p_j - p\| < \varepsilon.$

A set $S \subset \mathbb{R}^n$ is said to be closed if and only if

(4.B.3) $p_j \in S,\ p_j \to p \Longrightarrow p \in S.$

The complement $\mathbb{R}^n \setminus S$ of a closed set $S$ is open. Alternatively, $\Omega \subset \mathbb{R}^n$ is open if
and only if, given $q \in \Omega$, there exists $\varepsilon > 0$ such that $B_\varepsilon(q) \subset \Omega$, where

(4.B.4) $B_\varepsilon(q) = \{p \in \mathbb{R}^n : \|p - q\| < \varepsilon\},$

so $q$ cannot be a limit of a sequence of points in $\mathbb{R}^n \setminus \Omega$.


An important property of $\mathbb{R}^n$ is completeness, a property defined as follows. A
sequence $(p_j)$ of points in $\mathbb{R}^n$ is called a Cauchy sequence if and only if

(4.B.5) $\|p_j - p_k\| \to 0, \quad \text{as } j,k \to \infty.$

It is easy to see that if $p_j \to p$ for some $p \in \mathbb{R}^n$, then (4.B.5) holds. The completeness
property is the converse.

Theorem 4.B.1. If $(p_j)$ is a Cauchy sequence in $\mathbb{R}^n$, then it has a limit, i.e.,
(4.B.1) holds for some $p \in \mathbb{R}^n$.

Since convergence $p_j \to p$ in $\mathbb{R}^n$ is equivalent to convergence in $\mathbb{R}$ of each
component, it is the fundamental property of completeness of R that is the issue. 
This is discussed in [8], from an axiomatic viewpoint, and in [27] and [50], from a 
more constructive viewpoint. 

Completeness provides a path to the following key notion of compactness. A
set $K \subset \mathbb{R}^n$ is compact if and only if the following property holds.

(4.B.6) Each infinite sequence $(p_j)$ in $K$ has a subsequence
        that converges to a point in $K$.

It is clear that if $K$ is compact, then it must be closed. It must also be bounded, i.e.,
there exists $R < \infty$ such that $K \subset B_R(0)$. Indeed, if $K$ is not bounded, there exist
$p_j \in K$ such that $\|p_{j+1}\| \ge \|p_j\| + 1$. In such a case, $\|p_j - p_k\| \ge 1$ whenever $j \ne k$,
so $(p_j)$ cannot have a convergent subsequence. The following converse statement is
a key result.

Theorem 4.B.2. If $K \subset \mathbb{R}^n$ is closed and bounded, then it is compact.


We start with a special case.

Proposition 4.B.3. Each closed bounded interval $I = [a,b] \subset \mathbb{R}$ is compact.

Proof. Let $(p_j)$ be an infinite sequence in $[a,b]$, $j \in \mathbb{Z}^+$. Divide $I$ into two halves,
$I_0 = [a,(a+b)/2]$, $I_1 = [(a+b)/2, b]$. If $p_j \in I_0$ for infinitely many $j$, pick some
$p_{j_0} \in I_0$, and set $\alpha_0 = 0$. Otherwise, pick some $p_{j_0} \in I_1$, and set $\alpha_0 = 1$. Set
$q_0 = p_{j_0}$.

Now divide $I_{\alpha_0}$ into two equal intervals, $I_{\alpha_0 0}$ and $I_{\alpha_0 1}$. If $p_j \in I_{\alpha_0 0}$ for infinitely
many $j$, pick $p_{j_1} \in I_{\alpha_0 0}$, $j_1 > j_0$. Otherwise, pick $p_{j_1} \in I_{\alpha_0 1}$, $j_1 > j_0$. Set $q_1 = p_{j_1}$.
Continue.

One gets $(q_j)$, a subsequence of $(p_j)$, with the property that

(4.B.7) $|q_j - q_{j+k}| \le 2^{-j}|b - a|, \quad \forall\, k \ge 0.$

Thus $(q_j)$ is a Cauchy sequence, so by the completeness of $\mathbb{R}$, it converges, to the
desired limit $p \in [a,b]$.


From Proposition 4.B.3 it is easy enough to show that any closed, bounded box

(4.B.8) $B = \{(x_1,\dots,x_n) \in \mathbb{R}^n : a_j \le x_j \le b_j,\ \forall\, j\}$

is compact. If $K \subset \mathbb{R}^n$ is closed and bounded, it is a subset of such a box, and
clearly every closed subset of a compact set is compact, so we have Theorem 4.B.2.

We next discuss continuity. If $S \subset \mathbb{R}^n$, a function

(4.B.9) $f : S \to \mathbb{R}^m$

is said to be continuous at $p \in S$ provided

(4.B.10) $p_j \in S,\ p_j \to p \Longrightarrow f(p_j) \to f(p).$

If $f$ is continuous at each $p \in S$, we say $f$ is continuous on $S$.


The following two results give important connections between continuity and 
compactness. 


Proposition 4.B.4. If $K \subset \mathbb{R}^n$ is compact and $f : K \to \mathbb{R}^m$ is continuous, then
$f(K)$ is compact.

Proof. If $(q_k)$ is an infinite sequence of points in $f(K)$, pick $p_k \in K$ such that
$f(p_k) = q_k$. Since $K$ is compact, we have a subsequence $p_{k_\nu} \to p$ in $K$, and then
$q_{k_\nu} \to f(p)$ in $\mathbb{R}^m$.


This leads to the second connection. 


Proposition 4.B.5. If $K \subset \mathbb{R}^n$ is compact and $f : K \to \mathbb{R}^m$ is continuous, then
there exists $p \in K$ such that

(4.B.11) $\|f(p)\| = \max_{x \in K} \|f(x)\|,$

and there exists $q \in K$ such that

(4.B.12) $\|f(q)\| = \min_{x \in K} \|f(x)\|.$




The meaning of (4.B.11) is that $\|f(p)\| \ge \|f(x)\|$ for all $x \in K$, and the meaning
of (4.B.12) is similar.
For the proof, consider

(4.B.13) $g : K \to \mathbb{R}, \quad g(p) = \|f(p)\|.$

This is continuous, so $g(K)$ is compact. Hence $g(K)$ is bounded; say $g(K) \subset I =
[a,b]$. Repeatedly subdividing $I$ into equal halves, as in the proof of Proposition
4.B.3, at each stage throwing out subintervals that do not intersect $g(K)$ and
keeping only the leftmost and rightmost amongst those remaining, we obtain $\alpha \in g(K)$
and $\beta \in g(K)$ such that $g(K) \subset [\alpha,\beta]$. Then $\alpha = g(q)$ and $\beta = g(p)$ for some $p$
and $q \in K$ satisfying (4.B.11)-(4.B.12).


A variant of Proposition 4.B.5, with a very similar proof, is that if $K \subset \mathbb{R}^n$ is
compact and $f : K \to \mathbb{R}$ is continuous, then there exist $p, q \in K$ such that

(4.B.14) $f(p) = \max_{x \in K} f(x), \quad f(q) = \min_{x \in K} f(x).$

We next define the closure $\overline{S}$ of a set $S \subset \mathbb{R}^n$, to consist of all points $p \in \mathbb{R}^n$
such that $B_\varepsilon(p) \cap S \ne \emptyset$ for all $\varepsilon > 0$. Equivalently, $p \in \overline{S}$ if and only if there exists
an infinite sequence $(p_j)$ of points in $S$ such that $p_j \to p$.

Now we define $\sup S$ and $\inf S$. First, let $S \subset \mathbb{R}$ be nonempty and bounded
from above, i.e., there exists $R < \infty$ such that $x \le R$ for all $x \in S$. Hence $x \le R$
for all $x \in \overline{S}$. In such a case, there exists an interval $[R - K, R]$ whose intersection
with $\overline{S}$ is nonempty, hence compact. We set

(4.B.15) $\sup S = \max_{x \in \overline{S} \cap [R-K,R]} x,$

the right side well defined by (4.B.14), with $f(x) = x$. There is a similar definition
of

(4.B.16) $\inf S,$

when $S$ is bounded from below.
We establish some further properties of compact sets $K \subset \mathbb{R}^n$, leading to the
important result, Proposition 4.B.9 below.

Proposition 4.B.6. Let $K \subset \mathbb{R}^n$ be compact. Assume $X_1 \supset X_2 \supset X_3 \supset \cdots$ form
a decreasing sequence of closed subsets of $K$. If each $X_m \ne \emptyset$, then $\bigcap_m X_m \ne \emptyset$.

Proof. Pick $x_m \in X_m$. Since $K$ is compact, $(x_m)$ has a convergent subsequence,
$x_{m_k} \to y$. Since $\{x_{m_k} : k \ge \ell\} \subset X_{m_\ell}$, which is closed, we have $y \in \bigcap_m X_m$.

Corollary 4.B.7. Let $K \subset \mathbb{R}^n$ be compact. Assume $U_1 \subset U_2 \subset U_3 \subset \cdots$ form an
increasing sequence of open sets in $\mathbb{R}^n$. If $\bigcup_m U_m \supset K$, then $U_M \supset K$ for some $M$.

Proof. Consider $X_m = K \setminus U_m$.


Before getting to Proposition 4.B.9, we bring in the following. Let $\mathbb{Q}$ denote
the set of rational numbers, and let $\mathbb{Q}^n$ denote the set of points in $\mathbb{R}^n$ all of whose
components are rational. The set $\mathbb{Q}^n \subset \mathbb{R}^n$ has the following denseness property:
given $p \in \mathbb{R}^n$ and $\varepsilon > 0$, there exists $q \in \mathbb{Q}^n$ such that $\|p - q\| < \varepsilon$. Let

(4.B.17) $\mathcal{R} = \{B_{r_j}(q_j) : q_j \in \mathbb{Q}^n,\ r_j \in \mathbb{Q} \cap (0,\infty)\}.$

Note that $\mathbb{Q}$ and $\mathbb{Q}^n$ are countable, i.e., they can be put in one-to-one correspondence
with $\mathbb{N}$. Hence $\mathcal{R}$ is a countable collection of balls. The following lemma is left as
an exercise for the reader.


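Lemma 4.B.8. Let $\Omega \subset \mathbb{R}^n$ be a nonempty open set. Then

(4.B.18) $\Omega = \bigcup \{B : B \in \mathcal{R},\ B \subset \Omega\}.$

To state the next result, we say that a collection $\{U_\alpha : \alpha \in \mathcal{A}\}$ covers $K$ if
$K \subset \bigcup_{\alpha \in \mathcal{A}} U_\alpha$. If each $U_\alpha \subset \mathbb{R}^n$ is open, it is called an open cover of $K$. If $\mathcal{B} \subset \mathcal{A}$
and $K \subset \bigcup_{\beta \in \mathcal{B}} U_\beta$, we say $\{U_\beta : \beta \in \mathcal{B}\}$ is a subcover.

Proposition 4.B.9. If $K \subset \mathbb{R}^n$ is compact, then it has the following property.

(4.B.19) Every open cover $\{U_\alpha : \alpha \in \mathcal{A}\}$ of $K$ has a finite subcover.

Proof. By Lemma 4.B.8, it suffices to prove the following.

(4.B.20) Every countable cover $\{B_j : j \in \mathbb{N}\}$ of $K$ by open balls
         has a finite subcover.

For this, we set

(4.B.21) $U_m = B_1 \cup \cdots \cup B_m$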


and apply Corollary 4.B.7.


4.C. Critical points that are saddles


Let $F$ be a $C^3$ vector field on $\Omega \subset \mathbb{R}^n$, with a critical point at $p \in \Omega$. We say $p$ is
a simple critical point if $L = DF(p)$ has no eigenvalues that are purely imaginary
(or zero). From here on we assume this condition holds. As seen in Chapter 2, we
can write

(4.C.1) $\mathbb{C}^n = W_+ \oplus W_-,$

where $W_+$ is the direct sum of the generalized eigenspaces of $L$ associated to
eigenvalues with positive real part and $W_-$ is the direct sum of the generalized
eigenspaces associated to eigenvalues with negative real part. Since $L \in M(n,\mathbb{R})$,
non-real eigenvalues of $L$ must occur in complex conjugate pairs, and

(4.C.2) $\mathbb{R}^n = V_+ \oplus V_-, \quad V_\pm = W_\pm \cap \mathbb{R}^n.$

We have

(4.C.3) $v \in W_\pm \Longrightarrow e^{tL}v \to 0 \ \text{ as } t \to \mp\infty,$

and a fortiori

(4.C.4) $v \in V_\pm \Longrightarrow e^{tL}v \to 0 \ \text{ as } t \to \mp\infty.$

We say the critical point at $p$ is a source if $V_- = 0$, a sink if $V_+ = 0$, and a
saddle if $V_+ \ne 0$ and $V_- \ne 0$. The fact that

(4.C.5) $V_- = \mathbb{R}^n \Longrightarrow \Phi^t_F(x) \to p \ \text{ as } t \to +\infty,$

for $x$ sufficiently close to $p$, where $\Phi^t_F$ is the flow generated by $F$, was proven in
§4.3 (cf. Proposition 4.3.4), and similarly we have

(4.C.6) $V_+ = \mathbb{R}^n \Longrightarrow \Phi^t_F(x) \to p \ \text{ as } t \to -\infty,$

for $x$ sufficiently close to $p$. The purpose of this appendix is to discuss the saddle
case, where $n_+ = \dim V_+ > 0$ and $n_- = \dim V_- > 0$. In such a case, as advertised
in §4.3, there is a neighborhood $U$ of $p$ and there are $C^1$ surfaces $S_\pm$, of dimension
$n_\pm$, such that

(4.C.7) $\{p\} = S_+ \cap S_-,$

and

(4.C.8) $x \in S_\pm \Longrightarrow \Phi^t_F(x) \to p \ \text{ as } t \to \mp\infty.$


The surfaces $S_-$ and $S_+$ are called, respectively, the stable and unstable manifolds
of $F$ at $p$. They have the further property that if $\gamma$ is a $C^1$ curve in $S_+$ (respectively,
$S_-$), and $\gamma(0) = p$, then $\gamma'(0) \in V_+$ (respectively, $V_-$). In addition, given $\varepsilon > 0$,
there exists $\delta > 0$ such that if $x \in U \setminus S_-$ but $\operatorname{dist}(x,S_-) < \delta$, then for some
$t_1 > 0$, $\|\Phi^{t_1}_F(x) - p\| < \varepsilon$, and for all $t > t_1$, $\operatorname{dist}(\Phi^t_F(x),S_+) < \varepsilon$, at least until
$\Phi^t_F(x)$ exits $U$. We want to demonstrate this result. For simplicity of presentation,
we concentrate on the case $n = 2$ (and $n_+ = n_- = 1$). However, the argument we
present can be modified to treat saddles in higher dimension.


We make some preliminary constructions. Relabeling the coordinates, we can
assume $p = 0$. Altering $F$ outside some neighborhood of $p = 0$ if necessary, we can
assume $F$ is a $C^3$ vector field on $\mathbb{R}^n$ and there exists $C < \infty$ such that

(4.C.9) $\|F(x)\| \le C\|x\|, \quad \forall\, x \in \mathbb{R}^n.$

Hence, as seen in §4.3 (Exercise 3), $\Phi^t_F(x)$ is well defined for all $x \in \mathbb{R}^n$, $t \in \mathbb{R}$.
Applying the fundamental theorem of calculus twice gives

(4.C.10) $F(x) = Lx + \sum_{j,k} x_j x_k G_{jk}(x),$

where

(4.C.11) $L = DF(0),$

and $G_{jk}$ are $C^1$ vector fields, given by

(4.C.12) $G_{jk}(x) = \int_0^1\!\!\int_0^1 t\,\frac{\partial^2 F}{\partial x_k\,\partial x_j}(stx)\,ds\,dt.$

We define the family of vector fields $F_\varepsilon$ by

(4.C.13) $F_\varepsilon(x) = \frac{1}{\varepsilon}F(\varepsilon x),$

for $\varepsilon > 0$. By (4.C.10),

(4.C.14) $F_\varepsilon(x) = Lx + \varepsilon G_\varepsilon(x),$

where

(4.C.15) $G_\varepsilon(x) = \sum_{j,k} x_j x_k G_{jk}(\varepsilon x).$

Passing to the limit $\varepsilon \to 0$ gives $F_0(x) = Lx$. Results of §4.2 yield the following.

Lemma 4.C.1. Given $\delta > 0$, $T < \infty$, there exists $\varepsilon_0 = \varepsilon_0(\delta,T,F) > 0$ such that
for all $\varepsilon \in (0,\varepsilon_0]$,

(4.C.16) $\|x\| \le 2,\ |t| \le T,\ \|\Phi^s_{F_\varepsilon}(x)\| \le 2\ \ \forall\, s \in [0,t]$
         $\Longrightarrow \|\Phi^t_{F_\varepsilon}(x) - e^{tL}x\| \le \delta.$
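
The scaling in (4.C.13)-(4.C.15) can be checked on an example. The following is a minimal sketch, assuming a hypothetical planar vector field with $F(0) = 0$ and $DF(0) = \operatorname{diag}(1,-1)$ (the field, the grid, and the helper names are ours, chosen only for illustration). It measures the largest value of $\|F_\varepsilon(x) - Lx\|$ over a grid in $[-2,2]^2$ and shows that this gap shrinks roughly in proportion to $\varepsilon$, as (4.C.14) suggests.

```python
import math

def F(x):
    # Hypothetical planar field (illustration only): F(0) = 0, DF(0) = diag(1, -1).
    x1, x2 = x
    return (x1 + x1 * x2 + x2 ** 2, -x2 + math.sin(x1) ** 2)

def L(x):
    # Linearization of the field above at the origin.
    return (x[0], -x[1])

def F_eps(x, eps):
    # Scaled field of (4.C.13): F_eps(x) = F(eps x) / eps.
    y = F((eps * x[0], eps * x[1]))
    return (y[0] / eps, y[1] / eps)

def sup_gap(eps, n=21):
    # Largest value of ||F_eps(x) - L x|| over an n-by-n grid in [-2, 2]^2.
    worst = 0.0
    for i in range(n):
        for j in range(n):
            x = (-2.0 + 4.0 * i / (n - 1), -2.0 + 4.0 * j / (n - 1))
            fx, lx = F_eps(x, eps), L(x)
            worst = max(worst, math.hypot(fx[0] - lx[0], fx[1] - lx[1]))
    return worst

gaps = {eps: sup_gap(eps) for eps in (0.5, 0.25, 0.125, 0.0625)}
for eps, gap in gaps.items():
    print(eps, gap)
```

Halving $\varepsilon$ roughly halves the gap, consistent with $F_\varepsilon(x) - Lx = \varepsilon G_\varepsilon(x)$ with $G_\varepsilon$ bounded on bounded sets.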


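Specializing to $n = 2$, we can assume that

(4.C.17) $L = \begin{pmatrix} a & 0 \\ 0 & -b \end{pmatrix}, \quad a, b > 0.$

We take the box

(4.C.18) $\mathcal{O} = \{(x_1,x_2) : |x_1|, |x_2| \le 1\},$

and set

(4.C.19) $\mathcal{O}_k = 2^{-k}\mathcal{O}.$

We define four families of maps

(4.C.20) $\varphi_{\varepsilon j}, \psi_{\varepsilon j} : \bigl[-\tfrac12, \tfrac12\bigr] \to [-1,1], \quad j = 1,2, \quad 0 \le \varepsilon \le 1,$

as follows. For $j = 1$, define $t_\varepsilon(s)$ as the smallest positive number such that

$\Phi^{-t_\varepsilon(s)}_{F_\varepsilon}\bigl(s, \tfrac12\bigr) \in \{(\sigma,1) : -1 \le \sigma \le 1\},$

and then set $\varphi_{\varepsilon 1}(s)$ to be the $x_1$-coordinate of $\Phi^{-t_\varepsilon(s)}_{F_\varepsilon}(s,1/2)$. To give an alternative
description, we are mapping the top edge of $\mathcal{O}_1$ (identified with $[-1/2,1/2]$) to the
top edge of $\mathcal{O}$ (identified with $[-1,1]$) by the backward flow generated by $F_\varepsilon$.
Similarly define $\varphi_{\varepsilon 2}$ via the backward flow map of the bottom edge of $\mathcal{O}_1$ to the
bottom edge of $\mathcal{O}$, and define $\psi_{\varepsilon 1}$ and $\psi_{\varepsilon 2}$ via the forward flow maps of the right
and left edges of $\mathcal{O}_1$ to the corresponding edges of $\mathcal{O}$. See Figure 4.C.1. It is readily
verified that these maps are contractions for $\varepsilon = 0$, where $F_0(x) = Lx$, i.e., there
exists $A = A(a,b) < 1$ such that

(4.C.21) $|\varphi_{\varepsilon j}(s) - \varphi_{\varepsilon j}(t)| \le A|s - t|,$
         $|\psi_{\varepsilon j}(s) - \psi_{\varepsilon j}(t)| \le A|s - t|,$

for all $s,t \in [-1/2,1/2]$.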
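
For the linear flow ($\varepsilon = 0$) the maps in (4.C.20) can be written down explicitly, which makes the contraction property concrete. The following is a minimal sketch, assuming $L = \operatorname{diag}(a,-b)$ with $a,b > 0$, so that $e^{tL}(x_1,x_2) = (e^{at}x_1, e^{-bt}x_2)$; the function names and the sample values of $a$, $b$ are ours. Flowing backward from $(s,1/2)$ until $x_2 = 1$ takes time $(\log 2)/b$ and gives $\varphi_{01}(s) = 2^{-a/b}s$; flowing forward from $(1/2,s)$ until $x_1 = 1$ takes time $(\log 2)/a$ and gives $\psi_{01}(s) = 2^{-b/a}s$.

```python
import math

def phi_01(s, a, b):
    """Top-edge map for the linear flow: backward flow from (s, 1/2) to x2 = 1."""
    t_hit = math.log(2.0) / b          # backward time for x2 to grow from 1/2 to 1
    return s * math.exp(-a * t_hit)    # x1 shrinks by e^{-a t} along the backward flow

def psi_01(s, a, b):
    """Right-edge map for the linear flow: forward flow from (1/2, s) to x1 = 1."""
    t_hit = math.log(2.0) / a          # forward time for x1 to grow from 1/2 to 1
    return s * math.exp(-b * t_hit)    # x2 shrinks by e^{-b t} along the forward flow

def lipschitz_estimate(g, samples):
    # Largest difference quotient of g over all pairs of distinct sample points.
    return max(abs(g(s) - g(t)) / abs(s - t)
               for s in samples for t in samples if s != t)

a, b = 1.0, 3.0                        # hypothetical values, for illustration only
samples = [-0.5 + k / 20.0 for k in range(21)]
lip_phi = lipschitz_estimate(lambda s: phi_01(s, a, b), samples)
lip_psi = lipschitz_estimate(lambda s: psi_01(s, a, b), samples)
A_const = max(2.0 ** (-a / b), 2.0 ** (-b / a))

print(lip_phi, lip_psi, A_const)
```

Under these assumptions one admissible constant in (4.C.21) at $\varepsilon = 0$ is $\max(2^{-a/b}, 2^{-b/a})$, which is less than 1 for all $a,b > 0$.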

Results of §4.2 then establish the following.

Lemma 4.C.2. There exist $\varepsilon_1 = \varepsilon_1(F) > 0$ and $A = A(F) < 1$ such that whenever
$0 \le \varepsilon \le \varepsilon_1$, the maps $\varphi_{\varepsilon j}$ and $\psi_{\varepsilon j}$ in (4.C.20) are well defined on $[-1/2,1/2]$ and
(4.C.21) holds for all $s,t \in [-1/2,1/2]$.

We make a further adjustment. Take $\varepsilon_2 \le \min(\varepsilon_1(F), \varepsilon_0(1/10, 10, F))$. Further
shrinking $\varepsilon_2$ if necessary, arrange that, whenever $\varepsilon \in (0,\varepsilon_2]$,


1 
(4.0.22) lle|| < 2 => ||Fe(z) — Lal| s SIZ), 


so that, if $(x_1,x_2) \in \mathcal{O}$, $F_\varepsilon(x_1,x_2)$ points down if $x_2 \in [1/2,1]$, up if $x_2 \in [-1,-1/2]$,
right if $x_1 \in [1/2,1]$, and left if $x_1 \in [-1,-1/2]$. Now replace $F$ by $F_{\varepsilon_2}$, denoting
this scaled vector field by $F$. Then (4.C.16) holds with $T = 10$ and $\delta = 1/10$ for all
$\varepsilon \in (0,1]$ and (4.C.21) holds for all $s,t \in [-1/2,1/2]$, with $A < 1$, for all $\varepsilon \in (0,1]$.
For notational simplicity, set

(4.C.23) $\Phi^t_k = \Phi^t_{F_{\varepsilon_k}}, \quad \varepsilon_k = 2^{-k}.$

Note that dilation by the factor $2^k$ takes the flow $\Phi^t_0$ on $\mathcal{O}_k$ to the flow $\Phi^t_k$ on $\mathcal{O}$.

Figure 4.C.1. The maps $\varphi_{\varepsilon j}$ and $\psi_{\varepsilon j}$

Figure 4.C.2. Step 0


Figure 4.C.3. Beginning of Step 1

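With these preliminaries done, we start in earnest our demonstration that the
flow generated by $F$ has saddle-like behavior near the critical point $p = 0$. Denote
by $\mathcal{T}, \mathcal{B}, \mathcal{L}$, and $\mathcal{R}$ the top, bottom, left, and right edges of $\mathcal{O}$, and similarly denote
by $\mathcal{T}_k, \mathcal{B}_k, \mathcal{L}_k$, and $\mathcal{R}_k$ the top, bottom, left, and right sides of $\mathcal{O}_k$. Then the maps
(4.C.20) can by slight abuse of terminology be labeled

(4.C.24) $\varphi_{\varepsilon 1} : \mathcal{T}_1 \to \mathcal{T}, \quad \varphi_{\varepsilon 2} : \mathcal{B}_1 \to \mathcal{B},$
         $\psi_{\varepsilon 1} : \mathcal{R}_1 \to \mathcal{R}, \quad \psi_{\varepsilon 2} : \mathcal{L}_1 \to \mathcal{L}.$

Pick two points $p_{0\ell}, p_{0r} \in \mathcal{T}$ such that for some $t_{0\ell}, t_{0r} \in (0,1)$,

(4.C.25) $p^{\#}_{0\ell} = \Phi_0^{t_{0\ell}}(p_{0\ell}) \in \mathcal{L}, \quad p^{\#}_{0r} = \Phi_0^{t_{0r}}(p_{0r}) \in \mathcal{R},$

and pick two points $q_{0\ell}, q_{0r} \in \mathcal{B}$ such that for some $s_{0\ell}, s_{0r} \in (0,1)$,

(4.C.26) $q^{\#}_{0\ell} = \Phi_0^{s_{0\ell}}(q_{0\ell}) \in \mathcal{L}, \quad q^{\#}_{0r} = \Phi_0^{s_{0r}}(q_{0r}) \in \mathcal{R}.$

See Figure 4.C.2. The possibility to do this is guaranteed by Lemma 4.C.1. Denote
the orbits of $\Phi^t_0 = \Phi^t$ through $p_{0\ell}, p_{0r}$ by $\gamma_{0\ell}, \gamma_{0r}$ and those through $q_{0\ell}, q_{0r}$ by
$\sigma_{0\ell}, \sigma_{0r}$.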

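Let us call the construction just described Step 0. To continue, at Step 1,
pick $\tilde p_{1\ell}, \tilde p_{1r} \in \mathcal{T}_1$ and $\tilde q_{1\ell}, \tilde q_{1r} \in \mathcal{B}_1$ such that the following holds. Note that
$2\tilde p_{1\ell}, 2\tilde p_{1r} \in \mathcal{T}$ and $2\tilde q_{1\ell}, 2\tilde q_{1r} \in \mathcal{B}$. We require that, for some $t_{1\ell}, t_{1r} \in (0,T_1)$,

(4.C.27) $\Phi_1^{t_{1\ell}}(2\tilde p_{1\ell}) \in \mathcal{L}, \quad \Phi_1^{t_{1r}}(2\tilde p_{1r}) \in \mathcal{R},$

and for some $s_{1\ell}, s_{1r} \in (0,T_1)$,

(4.C.28) $\Phi_1^{s_{1\ell}}(2\tilde q_{1\ell}) \in \mathcal{L}, \quad \Phi_1^{s_{1r}}(2\tilde q_{1r}) \in \mathcal{R}.$

The conditions (4.C.27) and (4.C.28) are equivalent to

(4.C.29) $\tilde p^{\#}_{1\ell} = \Phi_0^{t_{1\ell}}(\tilde p_{1\ell}) \in \mathcal{L}_1, \quad \tilde p^{\#}_{1r} = \Phi_0^{t_{1r}}(\tilde p_{1r}) \in \mathcal{R}_1,$

and

(4.C.30) $\tilde q^{\#}_{1\ell} = \Phi_0^{s_{1\ell}}(\tilde q_{1\ell}) \in \mathcal{L}_1, \quad \tilde q^{\#}_{1r} = \Phi_0^{s_{1r}}(\tilde q_{1r}) \in \mathcal{R}_1.$

See Figure 4.C.3. Denote the orbits of $F$ through $\tilde p_{1\ell}, \tilde p_{1r}$ by $\gamma_{1\ell}, \gamma_{1r}$ and those
through $\tilde q_{1\ell}, \tilde q_{1r}$ by $\sigma_{1\ell}, \sigma_{1r}$. When picking $\tilde p_{1\ell}, \tilde p_{1r}, \tilde q_{1\ell}$, and $\tilde q_{1r}$, one can and should
enforce the following condition. If $\gamma_{0\ell}$ intersects $\mathcal{T}_1$, $\tilde p_{1\ell}$ should be to the right
of such an intersection, if $\gamma_{0r}$ intersects $\mathcal{T}_1$, $\tilde p_{1r}$ should be to the left of such an
intersection, and similarly for cases when $\sigma_{0\ell}$ or $\sigma_{0r}$ intersect $\mathcal{B}_1$. Also, we can take
$T_1 > 1$. (More on this below.)

Figure 4.C.4. Completion of Step 1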


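Now we continue the orbits $\gamma_{1\ell}, \gamma_{1r}, \sigma_{1\ell}$, and $\sigma_{1r}$ forward and backward, until
they intersect the boundary of $\mathcal{O}$, at points $p_{1\ell}, p_{1r}, q_{1\ell}, q_{1r}$ and $p^{\#}_{1\ell}, p^{\#}_{1r}, q^{\#}_{1\ell}, q^{\#}_{1r}$, as
illustrated in Figure 4.C.4. That such an intersection must occur is guaranteed by
(4.C.22). This, together with the fact that orbits of $F$ cannot intersect, guarantees
that

(4.C.31) $p_{0\ell} < p_{1\ell} < p_{1r} < p_{0r},$

in the sense that $p < p'$ means $p$ is to the left of $p'$. In a similar sense, made clear
in Figure 4.C.4, we have

(4.C.32) $q_{0\ell} < q_{1\ell} < q_{1r} < q_{0r},$
         $p^{\#}_{0r} < p^{\#}_{1r} < q^{\#}_{1r} < q^{\#}_{0r},$
         $p^{\#}_{0\ell} < p^{\#}_{1\ell} < q^{\#}_{1\ell} < q^{\#}_{0\ell}.$

Furthermore, as a consequence of (4.C.21), we have

(4.C.33) $|p_{1\ell} - p_{1r}| \le A|\tilde p_{1\ell} - \tilde p_{1r}| \le A,$
         $|q_{1\ell} - q_{1r}| \le A|\tilde q_{1\ell} - \tilde q_{1r}| \le A,$
         $|p^{\#}_{1r} - q^{\#}_{1r}| \le A|\tilde p^{\#}_{1r} - \tilde q^{\#}_{1r}| \le A,$
         $|p^{\#}_{1\ell} - q^{\#}_{1\ell}| \le A|\tilde p^{\#}_{1\ell} - \tilde q^{\#}_{1\ell}| \le A.$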


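We proceed iteratively. At step $k$, pick $\tilde p_{k\ell}, \tilde p_{kr} \in \mathcal{T}_k$ and $\tilde q_{k\ell}, \tilde q_{kr} \in \mathcal{B}_k$ such that
the following holds. Note that $2^k\tilde p_{k\ell}, 2^k\tilde p_{kr} \in \mathcal{T}$ and $2^k\tilde q_{k\ell}, 2^k\tilde q_{kr} \in \mathcal{B}$. We require
that, for some $t_{k\ell}, t_{kr} \in (0,T_k)$,

(4.C.34) $\Phi_k^{t_{k\ell}}(2^k\tilde p_{k\ell}) \in \mathcal{L}, \quad \Phi_k^{t_{kr}}(2^k\tilde p_{kr}) \in \mathcal{R},$

and for some $s_{k\ell}, s_{kr} \in (0,T_k)$,

(4.C.35) $\Phi_k^{s_{k\ell}}(2^k\tilde q_{k\ell}) \in \mathcal{L}, \quad \Phi_k^{s_{kr}}(2^k\tilde q_{kr}) \in \mathcal{R}.$

The conditions (4.C.34) and (4.C.35) are equivalent to

(4.C.36) $\tilde p^{\#}_{k\ell} = \Phi_0^{t_{k\ell}}(\tilde p_{k\ell}) \in \mathcal{L}_k, \quad \tilde p^{\#}_{kr} = \Phi_0^{t_{kr}}(\tilde p_{kr}) \in \mathcal{R}_k,$

and

(4.C.37) $\tilde q^{\#}_{k\ell} = \Phi_0^{s_{k\ell}}(\tilde q_{k\ell}) \in \mathcal{L}_k, \quad \tilde q^{\#}_{kr} = \Phi_0^{s_{kr}}(\tilde q_{kr}) \in \mathcal{R}_k.$

Denote the orbits of $F$ through the points $\tilde p_{k\ell}, \tilde p_{kr}$ by $\gamma_{k\ell}, \gamma_{kr}$, and those through
the points $\tilde q_{k\ell}, \tilde q_{kr}$ by $\sigma_{k\ell}, \sigma_{kr}$. When picking $\tilde p_{k\ell}, \tilde p_{kr}, \tilde q_{k\ell}$, and $\tilde q_{kr}$, one can and
should enforce the following condition. If $\gamma_{k-1,\ell}$ intersects $\mathcal{T}_k$, $\tilde p_{k\ell}$ should lie to the
right of such a point of intersection, if $\gamma_{k-1,r}$ intersects $\mathcal{T}_k$, $\tilde p_{kr}$ should lie to the
left of such a point of intersection, and similarly for cases where $\sigma_{k-1,\ell}$ or $\sigma_{k-1,r}$
intersect $\mathcal{B}_k$. At this point it is useful to note that, by Lemma 4.C.1, we can take

(4.C.38) $T_k \to \infty \ \text{ as } k \to \infty,$

and hence take (with $z = \ell$ or $r$)

(4.C.39) $\|2^k\tilde p_{kz} - (0,1)\| \le \eta_k, \quad \|2^k\tilde q_{kz} - (0,-1)\| \le \eta_k, \quad \eta_k \to 0 \ \text{ as } k \to \infty.$

It then follows that (again with $z = \ell$ or $r$)

(4.C.40) $\|\tilde p_{kz} - (0,2^{-k})\| \le 2^{-k}\eta_k, \quad \|\tilde q_{kz} - (0,-2^{-k})\| \le 2^{-k}\eta_k \to 0 \ \text{ as } k \to \infty.$

Now we continue the orbits $\gamma_{k\ell}, \gamma_{kr}, \sigma_{k\ell}$, and $\sigma_{kr}$ forward and backward, until
they intersect the boundary of $\mathcal{O}$, at points $p_{k\ell}, p_{kr}, q_{k\ell}, q_{kr}$, and $p^{\#}_{k\ell}, p^{\#}_{kr}, q^{\#}_{k\ell}, q^{\#}_{kr}$,
as illustrated in Figure 4.C.5. That such intersections must occur is guaranteed by
(4.C.22). As before, the fact that orbits of $F$ cannot intersect guarantees that

(4.C.41) $p_{0\ell} < \cdots < p_{k\ell} < p_{kr} < \cdots < p_{0r},$

in the sense specified in (4.C.31), and, as in (4.C.32),

(4.C.42) $q_{0\ell} < \cdots < q_{k\ell} < q_{kr} < \cdots < q_{0r},$
         $p^{\#}_{0r} < \cdots < p^{\#}_{kr} < q^{\#}_{kr} < \cdots < q^{\#}_{0r},$
         $p^{\#}_{0\ell} < \cdots < p^{\#}_{k\ell} < q^{\#}_{k\ell} < \cdots < q^{\#}_{0\ell}.$

Figure 4.C.5. Step k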


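Furthermore, as a consequence of (4.C.21), we have

(4.C.43) $|p_{k\ell} - p_{kr}| \le A^k|\tilde p_{k\ell} - \tilde p_{kr}| \le A^k 2^{-k}\eta_k,$
         $|q_{k\ell} - q_{kr}| \le A^k|\tilde q_{k\ell} - \tilde q_{kr}| \le A^k 2^{-k}\eta_k,$
         $|p^{\#}_{kr} - q^{\#}_{kr}| \le A^k|\tilde p^{\#}_{kr} - \tilde q^{\#}_{kr}| \le A^k 2^{-k}\tilde\eta_k,$
         $|p^{\#}_{k\ell} - q^{\#}_{k\ell}| \le A^k|\tilde p^{\#}_{k\ell} - \tilde q^{\#}_{k\ell}| \le A^k 2^{-k}\tilde\eta_k.$

In particular, these distances are converging to 0 quite rapidly. We obtain limits

(4.C.44) $p_{k\ell}, p_{kr} \to p_T \in \mathcal{T}, \quad q_{k\ell}, q_{kr} \to p_B \in \mathcal{B},$
         $p^{\#}_{kr}, q^{\#}_{kr} \to p_R \in \mathcal{R}, \quad p^{\#}_{k\ell}, q^{\#}_{k\ell} \to p_L \in \mathcal{L}.$

See Figure 4.C.6. We have

(4.C.45) $\Phi^t_F(p_T),\ \Phi^t_F(p_B) \to 0 \ \text{ as } t \to +\infty,$

and

(4.C.46) $\Phi^t_F(p_R),\ \Phi^t_F(p_L) \to 0 \ \text{ as } t \to -\infty,$

since the paths in (4.C.45) meet each $\mathcal{O}_k$ for large positive $t$ and those in (4.C.46)
meet each $\mathcal{O}_k$ for large negative $t$. Furthermore, by (4.C.39)-(4.C.40), plus the fact
that all these paths solve $dx/dt = F(x)$, the curves in (4.C.45) fit together to form
a $C^1$ curve tangent to the $x_2$-axis at $p = 0$, and those in (4.C.46) fit together to
form a $C^1$ curve tangent to the $x_1$-axis at $p = 0$.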

We sketch how to treat the case $n = 3$, $n_+ = 2$, $n_- = 1$. In place of (4.C.17),
we can take

(4.C.47) $L = \begin{pmatrix} A & 0 \\ 0 & -b \end{pmatrix}, \quad A \in M(2,\mathbb{R}), \quad b > 0,$

and, via Lemma 4.3.5, arrange that

(4.C.48) $Av \cdot v \ge a\|v\|^2, \quad a > 0, \quad \forall\, v \in \mathbb{R}^2.$

In place of (4.C.18), we use the cylinder

(4.C.49) $\mathcal{O} = \{(x_1,x_2,x_3) : x_1^2 + x_2^2 \le 1,\ |x_3| \le 1\},$

with boundary

(4.C.50) $\partial\mathcal{O} = \mathcal{T} \cup \mathcal{B} \cup \mathcal{L},$

where $\mathcal{T}$ and $\mathcal{B}$ (the top and bottom) are disks and $\mathcal{L}$ (the side) is $S^1 \times [-1,1]$. We
then take $\mathcal{O}_k = 2^{-k}\mathcal{O}$, with boundary $\mathcal{T}_k \cup \mathcal{B}_k \cup \mathcal{L}_k$. Parallel to (4.C.24), we have
(at least for small $\varepsilon$) maps

(4.C.51) $\varphi_{\varepsilon 1} : \mathcal{T}_1 \to \mathcal{T}, \quad \varphi_{\varepsilon 2} : \mathcal{B}_1 \to \mathcal{B},$
         $\psi_\varepsilon : \mathcal{L}_1 \to \mathcal{L},$

with $\varphi_{\varepsilon j}$ defined by backward flow of $\Phi^t_{F_\varepsilon}$, and $\psi_\varepsilon$ defined by forward flow. Again
the maps $\varphi_{\varepsilon j}$ are contractions for small $\varepsilon$. The maps $\psi_\varepsilon$ are not contractions, but
composing them on the left with the projection $S^1 \times [-1,1] \to [-1,1]$ produces a
contraction, for small $\varepsilon$, and this is what one needs. In place of a pair of initial data
on $\mathcal{T}$ and a pair on $\mathcal{B}$, one takes a circle of initial data on $\mathcal{T}$ and one on $\mathcal{B}$. Applying
$\Phi^t$ yields a pair of flared tubes, as pictured in Figure 4.C.7. From here, an iteration
produces nested families of such flared tubes, converging in on the one-dimensional
stable manifold $S_-$ and the two-dimensional unstable manifold $S_+$. The interested
reader is invited to fill in the details, and work out the higher dimensional cases.
See also [10] and [19] for other approaches to this result.

Figure 4.C.6. Limiting configuration as $k \to \infty$

Figure 4.C.7. Set-up for a saddle in 3D


4.D. Blown up phase portrait at a critical point 


Let $\Omega \subset \mathbb{R}^n$ be open, assume $0 \in \Omega$, and let $F : \Omega \to \mathbb{R}^n$ be a smooth vector field.
Assume $F(0) = 0$, and that 0 is a nondegenerate critical point, i.e.,

(4.D.1) $A = DF(0) \in M(n,\mathbb{R})$ is invertible.

We want to blow up the portrait of solutions to

(4.D.2) $x' = F(x),$

by using spherical polar coordinates,

(4.D.3) $x = r\omega, \quad r = r(t) \in [0,\infty), \quad \omega = \omega(t) \in S^{n-1},$

where $S^{n-1}$ is the unit sphere in $\mathbb{R}^n$. Note that

(4.D.4) $x' = r'\omega + r\omega',$

and $\omega(t)\cdot\omega(t) \equiv 1 \Rightarrow \omega' \perp \omega$, so the two vectors on the right side of (4.D.4) are
mutually orthogonal. We obtain

(4.D.5) $r' = x'\cdot\omega = \omega\cdot F(r\omega),$
        $\omega' = r^{-1}P_\omega x' = r^{-1}P_\omega F(r\omega),$

where $P_\omega$ is the orthogonal projection of $\mathbb{R}^n$ onto the orthogonal complement of
$\operatorname{Span}\omega$, i.e.,

(4.D.6) $P_\omega v = v - (v\cdot\omega)\omega.$

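To proceed, write

(4.D.7) $F(x) = Ax + R(x),$

where

(4.D.8) $R(x) = \sum_{j,k}\Bigl(\int_0^1 (1-s)\,\partial_j\partial_k F(sx)\,ds\Bigr)x_j x_k$
        $= r^2\sum_{j,k}\Bigl(\int_0^1 (1-s)\,\partial_j\partial_k F(sr\omega)\,ds\Bigr)\omega_j\omega_k$
        $= r^2 G(r,\omega).$

Here, if $B_R(0) \subset \Omega$, we have

(4.D.9) $G : (-R,R) \times S^{n-1} \to \mathbb{R}^n$, smooth.

Thus we can rewrite the system (4.D.5) as

(4.D.10) $r' = (\omega\cdot A\omega)\,r + \omega\cdot G(r,\omega)\,r^2,$
         $\omega' = P_\omega A\omega + rP_\omega G(r,\omega).$

This is a smooth system of ODE on $(-R,R) \times S^{n-1}$.
Now the null space $\mathcal{N}(P_\omega)$ is $\operatorname{Span}\omega$, and

(4.D.11) $P_\omega A\omega = A\omega - (\omega\cdot A\omega)\omega,$

so

(4.D.12) $P_\omega A\omega = 0 \Longleftrightarrow A\omega \parallel \omega$
         $\Longleftrightarrow \omega$ is an eigenvector of $A$,

in which case the associated eigenvalue $\lambda$ is necessarily equal to $\omega\cdot A\omega$ (and this
eigenvalue is nonzero if $A$ is invertible). Consequently, the right side of (4.D.10)
vanishes on

(4.D.13) $\{(r,\omega) : r = 0,\ \omega \in S^{n-1},\ A\omega = (\omega\cdot A\omega)\omega\},$

and we have the following.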

Proposition 4.D.1. If (4.D.1) holds, there exists $a > 0$ such that, on

(4.D.14) $\mathcal{O}_a = (-a,a) \times S^{n-1},$

the right side of (4.D.10) vanishes only on the set (4.D.13). If each real eigenspace
of $A$ is one dimensional, then each critical point of (4.D.10) is isolated.

Solutions to (4.D.10) define a flow on $\mathcal{O}_a$, and $\{(r,\omega) \in \mathcal{O}_a : r = 0\}$ is invariant
under this flow. The flow restricted to this set is the flow on $S^{n-1}$ defined by

(4.D.15) $\omega' = P_\omega A\omega,$

which is

(4.D.16) $\omega(t) = \|e^{tA}\omega_0\|^{-1}e^{tA}\omega_0, \quad \omega_0 = \omega(0).$

One convenient way to visualize the flow on $\mathcal{O}_a$ is to use the diffeomorphism

(4.D.17) $\Phi : \mathcal{O}_a \to U_a \subset \mathbb{R}^n,$

given by

(4.D.18) $\Phi(r,\omega) = e^r\omega = y,$

where

(4.D.19) $U_a = \{y \in \mathbb{R}^n : e^{-a} < |y| < e^a\}.$

The resulting flow on $U_a$, acting on $y = e^r\omega$, is defined by

(4.D.20) $y' = e^r r'\omega + e^r\omega'$
         $= e^r\bigl\{[(\omega\cdot A\omega)r + \omega\cdot G(r,\omega)r^2]\omega + P_\omega[A\omega + rG(r,\omega)]\bigr\}$
         $= X(y),$

where $G(r,\omega)$ is given by (4.D.8). Note that

(4.D.21) $G(0,\omega) = \frac12\sum_{j,k}\partial_j\partial_k F(0)\,\omega_j\omega_k.$

Figure 4.D.1. Saddle critical point and its blowup

We see that $X$ is a smooth vector field on $U_a$ whose critical points lie on $S^{n-1} \subset U_a$,
and $\omega_0 \in S^{n-1}$ is a critical point of $X$ if and only if

(4.D.22) $A\omega_0 = \lambda\omega_0,$

for some $\lambda \in \mathbb{R}$, necessarily

(4.D.23) $\lambda = \omega_0\cdot A\omega_0,$

and $\lambda \ne 0$, given (4.D.1).

It is of interest to specify when such a critical point $\omega_0$ of $X$ is nondegenerate,
i.e., when $DX(\omega_0) \in M(n,\mathbb{R})$ is invertible. To begin this analysis, we see from
(4.D.20) that

(4.D.24) $DX(\omega_0)\omega_0 = \lambda\omega_0.$

We next evaluate $DX(\omega_0)\xi$ when $\omega_0$ is a critical point and $\xi \perp \omega_0$. We have from
(4.D.20) that

(4.D.25) $DX(\omega_0)\xi = \frac{d}{ds}P_{\omega(s)}A\omega(s)\Big|_{s=0},$

where

(4.D.26) $\omega : (-\varepsilon,\varepsilon) \to S^{n-1}, \quad \omega(0) = \omega_0, \quad \omega'(0) = \xi.$

A computation gives

(4.D.27) $\frac{d}{ds}P_{\omega(s)}A\omega(s)\Big|_{s=0} = P_{\omega_0}A\xi - \lambda\xi,$

provided (4.D.22) holds, and $\xi \perp \omega_0$. Hence

(4.D.28) $DX(\omega_0)\xi = P_{\omega_0}A\xi - \lambda\xi,$

provided $\omega_0$ is a critical point of $X$ and $\xi \perp \omega_0$. It follows that, if $\xi \perp \omega_0$,

(4.D.29) $DX(\omega_0)(a\omega_0 + \xi) = a\lambda\omega_0 + P_{\omega_0}A\xi - \lambda\xi,$

Figure 4.D.2. A generic sink and its blowup

Figure 4.D.3. A nongeneric sink and its blowup


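so $\omega_0 \in S^{n-1}$ is a nondegenerate critical point of $X$ if and only if

(4.D.30) $a\lambda\omega_0 + P_{\omega_0}A\xi - \lambda\xi = 0,\ \xi \perp \omega_0$
         $\Longrightarrow a = 0$ and $\xi = 0.$

Since $\lambda \ne 0$, we have the conclusion $a = 0$, so the criterion boils down to

(4.D.31) $\xi \perp \omega_0,\ P_{\omega_0}A\xi - \lambda\xi = 0 \Longrightarrow \xi = 0.$

Now, for $\xi \perp \omega_0$,

(4.D.32) $P_{\omega_0}A\xi - \lambda\xi = 0 \Longleftrightarrow (A - \lambda I)\xi \in \operatorname{Span}\omega_0.$

If $\dim\mathcal{E}(A,\lambda) \ge 2$, one has nonzero $\xi \in \mathcal{E}(A,\lambda)$ orthogonal to $\omega_0$, so nondegeneracy
requires

(4.D.33) $\dim\mathcal{E}(A,\lambda) = 1.$

Figure 4.D.4. A spiral sink and its blowup

Furthermore, if $\dim\mathcal{N}((A - \lambda I)^2) \ge 2$, this space contains a nonzero element $\xi$
orthogonal to $\omega_0$, and, by (4.D.33), $(A - \lambda I)\xi \in \operatorname{Span}\omega_0$, so nondegeneracy requires

(4.D.34) $\dim\mathcal{N}((A - \lambda I)^2) = 1$, hence $\mathcal{N}((A - \lambda I)^2) = \mathcal{E}(A,\lambda).$

We have the following:

Proposition 4.D.2. Maintain the hypothesis (4.D.1). Let $\omega_0 \in S^{n-1}$ be a critical
point of $X$, so (4.D.22) holds, with $\lambda \in \mathbb{R}$, $\lambda \ne 0$. Then $\omega_0$ is a nondegenerate
critical point of $X$ if and only if

(4.D.35) $\mathcal{GE}(A,\lambda) = \mathcal{E}(A,\lambda)$, and $\dim\mathcal{E}(A,\lambda) = 1.$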


Figures 4.D.1—4.D.4 depict the blowups of critical points of four planar vector 
fields. These have the form (4.D.7) with A € M(2,R) given, respectively, by 


4.D.36) (’ an) (= a), Go ats fe ak 


The critical points are, respectively, a 


(4.D.37) saddle, generic sink, nongeneric sink, spiral sink.

We provide a rather sketchy depiction of the phase portraits of the blowups. We
sketch orbits on $S^1 = \{|y| = 1\}$. Other orbits sketched are confined to $\{|y| > 1\}$,
corresponding to $\{r > 0\}$, and we only sketch orbits that lead to (or from) critical
points of $X$ on $S^1$, except for the spiral sink, where $X$ has no critical points.

Figure 4.D.1, depicting the blowup of a saddle, has $\pm e_1$ and $\pm e_2$ as critical
points of $X$, each one in turn being a saddle. Figure 4.D.2, depicting the blowup
of a generic sink, also has $\pm e_1$ and $\pm e_2$ as critical points of $X$. In this case, $\pm e_1$
are saddles and $\pm e_2$ are sinks. In Figure 4.D.3, depicting a nongeneric sink, the
only eigenvectors of $A$ in $S^1$ are $\pm e_1$, and these are the only critical points of $X$.
Consistent with Proposition 4.D.2, these critical points are degenerate, and the
orbits pictured here illustrate this. Figure 4.D.4 deals with a spiral sink. In this
case, $A$ has no real eigenvectors, so, as noted above, $X$ has no critical points on $S^1$.
This figure illustrates how orbits of $X$ spiral into $S^1$.
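
When $A$ is diagonal with distinct nonzero entries $\lambda_1,\dots,\lambda_n$, the types of the critical points $\pm e_i$ of $X$ can be read off from (4.D.24) and (4.D.28): in the radial direction $DX(e_i)$ acts as $\lambda_i$, and for $\xi = e_j$, $j \ne i$, one gets $P_{e_i}A\xi - \lambda_i\xi = (\lambda_j - \lambda_i)\xi$. The following is a minimal sketch of this bookkeeping. The helper name and the sample matrices are ours; the matrices are hypothetical examples and are not claimed to be those of (4.D.36).

```python
def critical_point_types(lams):
    """Classify the critical points +-e_i of X for A = diag(lams).

    Uses the eigenvalues of DX(e_i) suggested by (4.D.24) and (4.D.28):
    lams[i] in the radial direction and lams[j] - lams[i] along e_j, j != i.
    """
    types = []
    for i, lam in enumerate(lams):
        spectrum = [lam] + [mu - lam for j, mu in enumerate(lams) if j != i]
        if any(e == 0 for e in spectrum):
            kind = "degenerate"
        elif all(e > 0 for e in spectrum):
            kind = "source"
        elif all(e < 0 for e in spectrum):
            kind = "sink"
        else:
            kind = "saddle"
        types.append(kind)
    return types

# Hypothetical planar saddle A = diag(1, -1).
saddle_case = critical_point_types([1.0, -1.0])
# Hypothetical generic planar sink A = diag(-2, -1).
sink_case = critical_point_types([-2.0, -1.0])
# Hypothetical 3D saddle A = diag(2, 1, -1).
saddle_3d = critical_point_types([2.0, 1.0, -1.0])

print(saddle_case, sink_case, saddle_3d)
```

For the planar saddle both $\pm e_1$ and $\pm e_2$ come out as saddles of $X$, and for the generic sink $\pm e_1$ are saddles while $\pm e_2$ are sinks, matching the descriptions of Figures 4.D.1 and 4.D.2.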


In Figure 4.D.5 we depict the blowup of a 3D saddle. In this case, $F(x)$ has
the form (4.D.7), with

(4.D.38) $A = \begin{pmatrix} \lambda_1 & & \\ & \lambda_2 & \\ & & \lambda_3 \end{pmatrix}, \quad \lambda_3 < 0 < \lambda_2 < \lambda_1.$

Figure 4.D.5. Blowup of a 3D saddle

Figure 4.D.6. Blowup of a 3D spiral sink

In this situation, the critical points of $X$ on $S^2 = \{|y| = 1\}$ are $\pm e_j$, $1 \le j \le 3$,
each of which is a saddle, though the saddles are of three different types. When
one restricts the flow generated by $X$ to $S^2$, one sees that $\pm e_1$ are sinks, $\pm e_2$ are
saddles, and $\pm e_3$ are sources. Note the heteroclinic orbits connecting these various
critical points.


In Figure 4.D.6 we depict the behavior of the blowup of a 3D spiral sink. In 
this case, F(x) has the form (4.D.7), with 


(4.D.39) A= e ) . B= Ge =) , ft, Ag € (—00, 0). 
The eigenvalues of $B$ are $-1 \pm \mu i$. There are two critical points of $X$ on $S^2$, at
$\pm e_3$. Figure 4.D.6 depicts the case $\lambda_3 = -2$. In this case, $\pm e_3$ are spiral saddles
for $X$. When one restricts the flow generated by $X$ to $S^2$, the points $\pm e_3$ are spiral
sources. This is somewhat analogous to the saddle behavior of $\pm e_1$ in Figure 4.D.2.
In addition, the equator of $S^2$ (in the $(e_1e_2)$-plane) is an attracting cycle for the
flow generated by $X$.

The reader is invited to consider the behavior of blowups in case (4.D.39) when
one takes $\lambda_3 = -1/2$, or $-1$.


4.E. Periodic solutions of $x'' + x = \varepsilon\psi(x)$


Equations of the form

(4.E.1) $x'' + x = \varepsilon\psi(x)$

with small $\varepsilon$ arise in a number of cases, and it is of interest to analyze various
features of these solutions. For example, as mentioned in §4.6, the relativistic
correction for planetary motion gives rise to the equation (4.6.53), which takes the
form (4.E.1) for $x = u - A$, with

(4.E.2) $\psi(x) = (x + A)^2.$

Another example,

(4.E.3) $\psi(x) = x^3,$

yields a special case of Duffing's equation. As we mentioned in §4.6, solutions to
(4.6.53) tend not to be periodic of period $2\pi$, and this leads to the phenomenon of
precession of perihelia. It is of general interest to compute the period of a solution
to (4.E.1), and we discuss this problem here. We assume $\psi$ is smooth.


We rewrite (4.E.1) as a first order system and also explicitly record the
dependence on $\varepsilon$:

(4.E.4) $x_\varepsilon'(t) = y_\varepsilon(t),$
        $y_\varepsilon'(t) = -x_\varepsilon(t) + \varepsilon\psi(x_\varepsilon(t)).$

We pick $a \in (0,\infty)$ and impose the initial conditions

(4.E.5) $x_\varepsilon(0) = a, \quad y_\varepsilon(0) = 0.$

Note that if we take

(4.E.6) $F_\varepsilon(x,y) = \frac{x^2}{2} + \frac{y^2}{2} - \varepsilon\Psi(x),$

where $\Psi'(x) = \psi(x)$, then $(d/dt)F_\varepsilon(x_\varepsilon(t), y_\varepsilon(t)) = 0$ for solutions to (4.E.4), so
orbits of (4.E.4) lie on level curves of $F_\varepsilon$. For $\varepsilon$ sufficiently small with respect to
$a$, the level curves of $F_\varepsilon$ on $\{(x,y) : x^2 + y^2 \le 2a^2\}$ will be close to those of $F_0$,
that is to say, such level curves of $F_\varepsilon$ will be closed curves, close to circles, and the
associated solutions to (4.E.4)-(4.E.5) will be periodic. The period $T(\varepsilon)$ will have
the following two properties, at least for small $\varepsilon$:

(4.E.7) $T(\varepsilon) = 2\pi + O(\varepsilon), \quad y_\varepsilon(T(\varepsilon)) = 0.$


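We will calculate a more precise approximation to $T(\varepsilon)$, accurate for small $\varepsilon$.
The first order of business is to calculate accurate approximations to $x_\varepsilon(t)$ and
$y_\varepsilon(t)$, valid uniformly for $t$ in an interval containing $[0,2\pi]$. It follows from §4.2 that
$x_\varepsilon(t)$ and $y_\varepsilon(t)$ are smooth functions of $\varepsilon$. Hence, for each $N \in \mathbb{N}$, we can write

(4.E.8) $x_\varepsilon(t) = a\cos t + \sum_{k=1}^N X_k(t)\varepsilon^k + R_{1N}(t,\varepsilon),$
        $y_\varepsilon(t) = -a\sin t + \sum_{k=1}^N Y_k(t)\varepsilon^k + R_{2N}(t,\varepsilon),$

where

(4.E.9) $|R_{jN}(t,\varepsilon)| \le C_{KN}\,\varepsilon^{N+1}, \quad \forall\, |t| \le K.$

We write $R_{jN}(t,\varepsilon) = O(\varepsilon^{N+1})$. The coefficients $X_k(t)$ and $Y_k(t)$ satisfy differential
equations, obtained as follows. We have from (4.E.8)

(4.E.10) $x_\varepsilon''(t) + x_\varepsilon(t) = \sum_{k=1}^N [X_k'' + X_k]\varepsilon^k + O(\varepsilon^{N+1}),$

while

(4.E.11) $\varepsilon\psi(x_\varepsilon(t)) = \varepsilon\psi\Bigl(a\cos t + \sum_{k=1}^N X_k\varepsilon^k\Bigr) + O(\varepsilon^{N+1})$
         $= \varepsilon\Bigl[\psi(a\cos t) + \sum_{j=1}^N \frac{1}{j!}\psi^{(j)}(a\cos t)\Bigl(\sum_{k=1}^N X_k(t)\varepsilon^k\Bigr)^j\Bigr] + O(\varepsilon^{N+1}).$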

We match up the coefficients of e* in (4.E.10) and (4.E.11) to obtain equations for 
X;(t). The case k = 1 gives 


(4.E.12) X(t) + Xi(t) = v(acost), 

and from (4.E.5) the initial conditions are seen to be 

(4.B.13) X1(0)=0, X4(0) =0. 

The solution to (4.E.12)-(4.E.13) is given by Duhamel’s formula, cf. (3.4.9) of 

Chapter 3: 

4.E.14) Xi (t) = i: sin(t — s) w(acoss) ds. 
0 


It is convenient to expand sin(t — s) and rewrite (4.E.14) as 


t t 

4.E.15) X(t) = (int) [ cos s w(acos s) ds — (cos »f sins ~(acos s) ds. 
0 0 

Regarding Y;,(t), we have 

(4.E.16)  $Y_k(t) = X_k'(t),$

for all $k$, and in particular

(4.E.17)  $Y_1(t) = (\cos t)\int_0^t \cos s\,\psi(a\cos s)\,ds + (\sin t)\int_0^t \sin s\,\psi(a\cos s)\,ds.$

In case $\psi(x)$ is given by (4.E.2), we have

(4.E.18)  $\psi(a\cos s) = \frac{a^2}{2}\cos 2s + 2aA\cos s + \Bigl(A^2 + \frac{a^2}{2}\Bigr),$

hence

(4.E.19)  $\int_0^t \cos s\,\psi(a\cos s)\,ds = \Bigl(A^2 + \frac{3a^2}{4}\Bigr)\sin t + \frac{aA}{2}\sin 2t + \frac{a^2}{12}\sin 3t + aAt,$

and

(4.E.20)  $\int_0^t \sin s\,\psi(a\cos s)\,ds = \Bigl(A^2 + \frac{aA}{2} + \frac{a^2}{3}\Bigr) - \Bigl(A^2 + \frac{a^2}{4}\Bigr)\cos t - \frac{aA}{2}\cos 2t - \frac{a^2}{12}\cos 3t.$
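The closed forms (4.E.19) and (4.E.20) can be checked against direct quadrature. The sketch below is only an illustration: it assumes that (4.E.2) reads $\psi(x) = (x + A)^2$, which is the form consistent with (4.E.18), and the helper names `simpson`, `cos_integral_closed`, and `sin_integral_closed` are hypothetical.

```python
import math

def psi(x, A):
    # Assumed form of (4.E.2), inferred from (4.E.18): psi(x) = (x + A)^2.
    return (x + A) ** 2

def simpson(f, lo, hi, n=2000):
    # Composite Simpson rule with n (even) subintervals.
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def cos_integral_closed(t, a, A):
    # Right side of (4.E.19).
    return ((A**2 + 0.75 * a**2) * math.sin(t) + 0.5 * a * A * math.sin(2 * t)
            + a**2 / 12 * math.sin(3 * t) + a * A * t)

def sin_integral_closed(t, a, A):
    # Right side of (4.E.20).
    return ((A**2 + 0.5 * a * A + a**2 / 3) - (A**2 + 0.25 * a**2) * math.cos(t)
            - 0.5 * a * A * math.cos(2 * t) - a**2 / 12 * math.cos(3 * t))

a, A, t = 1.3, 0.7, 2.0  # arbitrary sample values
num_cos = simpson(lambda s: math.cos(s) * psi(a * math.cos(s), A), 0.0, t)
num_sin = simpson(lambda s: math.sin(s) * psi(a * math.cos(s), A), 0.0, t)
print(num_cos - cos_integral_closed(t, a, A))
print(num_sin - sin_integral_closed(t, a, A))
```

Both printed differences should be at the level of quadrature error. The term $aAt$ in (4.E.19) is the only one that is not periodic in $t$.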


One can compute higher terms in (4.E.8). For example, matching up coefficients
of $\epsilon^2$ in (4.E.10) and (4.E.11) yields

(4.E.21)  $X_2''(t) + X_2(t) = \psi'(a\cos t)X_1(t).$

Again $X_2(0) = X_2'(0) = 0$, and, parallel to (4.E.14), we have

(4.E.22)  $X_2(t) = \int_0^t \sin(t-s)\,\psi'(a\cos s)X_1(s)\,ds.$

$Y_2(t)$ is given by (4.E.16). One can continue this, but we will leave off at this point.


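We return to the problem of approximating the period $T(\epsilon)$, making use of
(4.E.7). A very effective method for solving $y_\epsilon(T) = 0$ with $T \approx 2\pi$ is Newton's
method, which gives $T(\epsilon)$ as the limit of $T_n(\epsilon)$, defined recursively by

(4.E.23)  $T_0(\epsilon) = 2\pi, \quad T_{n+1}(\epsilon) = T_n(\epsilon) - \frac{y_\epsilon(T_n(\epsilon))}{y_\epsilon'(T_n(\epsilon))}.$

(A general treatment of Newton's method is given in Chapter 5 of [50].) This
sequence converges fast:

(4.E.24)  $T(\epsilon) = T_n(\epsilon) + O(\epsilon^{2^n}),$

provided one has $y_\epsilon(t)$ evaluated exactly. Given an approximation to $y_\epsilon(t)$,

(4.E.25)  $y_\epsilon(t) = \tilde y_\epsilon(t) + O(\epsilon^N), \quad y_\epsilon'(t) = \tilde y_\epsilon'(t) + O(\epsilon^N),$

we can work with $\tilde T_n(\epsilon)$, given by

(4.E.26)  $\tilde T_0(\epsilon) = 2\pi, \quad \tilde T_{n+1}(\epsilon) = \tilde T_n(\epsilon) - \frac{\tilde y_\epsilon(\tilde T_n(\epsilon))}{\tilde y_\epsilon'(\tilde T_n(\epsilon))},$

and we get

(4.E.27)  $T(\epsilon) = \tilde T_n(\epsilon) + O(\epsilon^N), \quad \text{provided } 2^n \ge N.$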


In particular, taking

(4.E.28)  $y_\epsilon(t) = -a\sin t + Y_1(t)\epsilon + O(\epsilon^2),$

we have

(4.E.29)  $T(\epsilon) = \tilde T_1(\epsilon) + O(\epsilon^2),$

with

(4.E.30)  $\tilde T_1(\epsilon) = 2\pi - \frac{\tilde y_\epsilon(2\pi)}{\tilde y_\epsilon'(2\pi)} = 2\pi + \frac{1}{a}Y_1(2\pi)\epsilon + O(\epsilon^2),$

hence, by (4.E.17),

(4.E.31)  $T(\epsilon) = 2\pi + \frac{\epsilon}{a}\int_0^{2\pi} \cos s\,\psi(a\cos s)\,ds + O(\epsilon^2).$

In case $\psi(x)$ is given by (4.E.2), we have from (4.E.19) that

(4.E.32)  $\int_0^{2\pi} \cos s\,\psi(a\cos s)\,ds = 2\pi aA,$

so in this case

(4.E.33)  $T(\epsilon) = 2\pi(1 + A\epsilon) + O(\epsilon^2).$


Given an approximation $\tilde y_\epsilon(t)$ satisfying (4.E.25) with $N = 3$ or $4$, we can
iterate (4.E.26) once more, obtaining $\tilde T_2(\epsilon) = T(\epsilon) + O(\epsilon^N)$, and so on. We will
not pursue the details.
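As a numerical sanity check of (4.E.33), one can integrate the system directly and measure the period. The following is a minimal sketch, not the method of the text: it assumes that (4.E.4) is the system $x' = y$, $y' = -x + \epsilon\psi(x)$ with $\psi(x) = (x + A)^2$ (consistent with (4.E.8), (4.E.16), and (4.E.18)), uses a classical Runge-Kutta step, and the function name `period` and the sample values are hypothetical.

```python
import math

def period(psi, a, eps, h=1e-3, t_max=50.0):
    """Estimate the period of x'' + x = eps*psi(x), x(0) = a > 0, x'(0) = 0.

    Integrates x' = y, y' = -x + eps*psi(x) with classical RK4 and returns
    the first time y crosses zero from above (the analogue of y_eps(T) = 0
    in (4.E.7)), located by linear interpolation inside the last step.
    """
    def rhs(x, y):
        return y, -x + eps * psi(x)

    t, x, y = 0.0, a, 0.0
    while t < t_max:
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x_new = x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y_new = y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if y > 0.0 and y_new <= 0.0:
            return t + h * y / (y - y_new)
        t, x, y = t + h, x_new, y_new
    raise RuntimeError("no return to y = 0 found before t_max")

A, a, eps = 0.5, 1.0, 0.01  # sample values
T_numeric = period(lambda x: (x + A) ** 2, a, eps)
T_first_order = 2 * math.pi * (1 + A * eps)  # (4.E.33) without the remainder
print(T_numeric, T_first_order, T_numeric - T_first_order)
```

The difference printed last should be of size $O(\epsilon^2)$, much smaller than the first order shift $2\pi A\epsilon$.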


We now return to the problem of approximating the solution $(x_\epsilon(t), y_\epsilon(t))$ of
(4.E.4), and address a limitation of the approximations of the form (4.E.8). As
follows from (4.E.15)-(4.E.20), the first order approximation has the form

(4.E.34)  $x_\epsilon(t) = a\cos t + X_1(t)\epsilon + O(\epsilon^2),$
          $y_\epsilon(t) = -a\sin t + Y_1(t)\epsilon + O(\epsilon^2),$

and, in the case that $\psi(x)$ is given by (4.E.2),

(4.E.35)  $X_1(t) = X_1^b(t) + aAt\sin t,$
          $Y_1(t) = Y_1^b(t) + aAt\cos t,$

where $X_1^b(t)$ and $Y_1^b(t)$ are periodic in $t$, of period $2\pi$, being sums of products
of $\sin kt$ and $\cos kt$ ($0 \le k \le 3$). In (4.E.34), the notation $O(\epsilon^2)$ means that,
for any given bounded interval $[-K, K]$, the remainder is bounded by $C_K\epsilon^2$, for
$t \in [-K,K]$. However, it is apparent from (4.E.35) that the accuracy of this
approximation breaks down severely on intervals of length $\sim 1/\epsilon$. In fact, both
$x_\epsilon(t)$ and $y_\epsilon(t)$ are uniformly bounded, being periodic of period $T(\epsilon)$. As far as the
terms on the right side of (4.E.34) are concerned,

(4.E.36)  $a\cos t + X_1^b(t)\epsilon \quad\text{and}\quad -a\sin t + Y_1^b(t)\epsilon$

are uniformly bounded, of period $2\pi$, but

(4.E.37)  $aA\epsilon t\sin t \quad\text{and}\quad aA\epsilon t\cos t$

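are unbounded as $|t| \to \infty$. These terms are called secular terms, and it is desirable
to have a replacement for (4.E.8), in which such secular terms do not appear. To
get this, we proceed as follows.

The functions

(4.E.38)  $x_\epsilon^\#(t) = x_\epsilon\Bigl(\frac{T(\epsilon)t}{2\pi}\Bigr), \quad y_\epsilon^\#(t) = y_\epsilon\Bigl(\frac{T(\epsilon)t}{2\pi}\Bigr)$

are periodic of period $2\pi$ in $t$ and smooth in $\epsilon$. Hence we have expansions

(4.E.39)  $x_\epsilon^\#(t) = a\cos t + \sum_{k=1}^N X_k^\#(t)\epsilon^k + O(\epsilon^{N+1}),$
          $y_\epsilon^\#(t) = -a\sin t + \sum_{k=1}^N Y_k^\#(t)\epsilon^k + O(\epsilon^{N+1}).$

Note that

(4.E.40)  $\frac{d}{dt}x_\epsilon^\#(t) = \frac{T(\epsilon)}{2\pi}\,y_\epsilon^\#(t),$

which leads to a variant of (4.E.16). We have the following.

Proposition 4.E.1. The solution to (4.E.4)-(4.E.5) has the expansion

(4.E.41)  $x_\epsilon(t) = a\cos\frac{2\pi t}{T(\epsilon)} + \sum_{k=1}^N X_k^\#\Bigl(\frac{2\pi t}{T(\epsilon)}\Bigr)\epsilon^k + O(\epsilon^{N+1}),$
          $y_\epsilon(t) = -a\sin\frac{2\pi t}{T(\epsilon)} + \sum_{k=1}^N Y_k^\#\Bigl(\frac{2\pi t}{T(\epsilon)}\Bigr)\epsilon^k + O(\epsilon^{N+1}).$

Each term in this series is periodic in $t$ of period $T(\epsilon)$, and the remainders are
$O(\epsilon^{N+1})$ uniformly for all $t \in \mathbb{R}$.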


It is natural and convenient to set

(4.E.42)  $X_0(t) = X_0^\#(t) = a\cos t, \quad Y_0(t) = Y_0^\#(t) = -a\sin t.$

It remains to compute $X_k^\#(t)$ and $Y_k^\#(t)$ for $k \ge 1$. To this end, set

(4.E.43)  $\frac{T(\epsilon)}{2\pi} = 1 + \gamma(\epsilon), \quad \gamma(\epsilon) = \epsilon\sum_{\ell \ge 0} \gamma_\ell\,\epsilon^\ell.$

If we compare the expressions for $x_\epsilon(t)$ in (4.E.8) and (4.E.41) and make the
substitution $s = 2\pi t/T(\epsilon)$, we obtain

(4.E.44)  $\sum_{k \ge 0} X_k^\#(s)\epsilon^k = \sum_{j \ge 0} X_j(s + \gamma(\epsilon)s)\epsilon^j$
          $\qquad = \sum_{j \ge 0}\sum_{m \ge 0} \frac{1}{m!}X_j^{(m)}(s)\,(\gamma(\epsilon)s)^m\epsilon^j.$

We conclude that $X_k^\#(s)$ is equal to the coefficient of $\epsilon^k$ in the last power series.
For $k = 0$, we get

(4.E.45)  $X_0^\#(s) = X_0(s) = a\cos s,$

as already noted in (4.E.42). For $k = 1$, we get

(4.E.46)  $X_1^\#(s) = X_1(s) + \gamma_0 sX_0'(s) = X_1(s) - \gamma_0 as\sin s.$

When $\psi(x)$ is given by (4.E.2), we have from (4.E.35) that this is

(4.E.47)  $= X_1^b(s) + aAs\sin s - \gamma_0 as\sin s = X_1^b(s),$

the last identity by (4.E.43) and (4.E.33), which gives $\gamma_0 = A$ in this case. Alternatively,
since $X_1^\#(s)$ and $X_1^b(s)$ are periodic in $s$ and the other terms are secular,
these secular terms have to cancel. This holds for general $\psi(x)$; $X_1^\#(s)$ is obtained
from $X_1(s)$ by striking out the secular terms. One can similarly characterize the
higher order terms $X_k^\#(t)$ in (4.E.39). We forego the details.
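The effect of the secular terms shows up already at leading order. The sketch below, under the same assumptions as before ($\psi(x) = (x + A)^2$ and the system $x' = y$, $y' = -x + \epsilon\psi(x)$), compares a numerical solution on an interval of length comparable to $1/\epsilon$ with two candidates: the unscaled leading term $a\cos t$ from (4.E.8), and the rescaled leading term $a\cos(2\pi t/T)$ from (4.E.41), with $T$ replaced by its first order value from (4.E.33). The function name `max_errors` and the sample values are hypothetical.

```python
import math

def max_errors(a, A, eps, t_end, h=0.01):
    """Return (err_naive, err_rescaled) over [0, t_end].

    err_naive    = max |x(t) - a*cos(t)|
    err_rescaled = max |x(t) - a*cos(2*pi*t/T1)|,  T1 = 2*pi*(1 + A*eps),
    where x solves x'' + x = eps*(x + A)**2, x(0) = a, x'(0) = 0 (RK4).
    """
    def rhs(x, y):
        return y, -x + eps * (x + A) ** 2

    T1 = 2 * math.pi * (1 + A * eps)
    t, x, y = 0.0, a, 0.0
    err_naive = err_rescaled = 0.0
    for _ in range(int(round(t_end / h))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += h
        err_naive = max(err_naive, abs(x - a * math.cos(t)))
        err_rescaled = max(err_rescaled, abs(x - a * math.cos(2 * math.pi * t / T1)))
    return err_naive, err_rescaled

naive, rescaled = max_errors(a=1.0, A=0.5, eps=0.01, t_end=300.0)
print(naive, rescaled)
```

On this interval the unscaled approximation drifts out of phase and its error becomes of order one, while the rescaled one stays small, in line with the uniform remainder estimate in Proposition 4.E.1.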


We end this appendix with an indication of how to extend the scope of (4.E.1).
We treat the pendulum equation

(4.E.48)  $u'' + \sin u = 0,$

and seek information on small oscillations, solving (4.E.48) with initial data

(4.E.49)  $u(0) = a\sqrt{\epsilon}, \quad u'(0) = 0.$

Thus we set

(4.E.50)  $x(t) = \frac{u(t)}{\sqrt{\epsilon}},$

which solves

(4.E.51)  $x'' + \frac{\sin\sqrt{\epsilon}\,x}{\sqrt{\epsilon}} = 0, \quad x(0) = a, \quad x'(0) = 0.$

If we set

(4.E.52)  $\frac{\sin\tau}{\tau} = 1 - \tau^2F(\tau), \quad F(\tau) = \frac{1}{3!} - \frac{\tau^2}{5!} + \cdots,$

we get

(4.E.53)  $x'' + x = \epsilon x^3F(\sqrt{\epsilon}\,x) = \frac{\epsilon}{3!}x^3 - \frac{\epsilon^2}{5!}x^5 + \cdots.$

This has a form similar to (4.E.1), but generalized to

(4.E.54)  $x'' + x = \epsilon\psi(\epsilon, x),$

with $\psi$ smooth in $(\epsilon, x)$. Treatments of the solutions to (4.E.1) and their periods
$T(\epsilon)$ extend to the case (4.E.54). The reader is invited to work out details.
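As a small worked instance, one can evaluate the first order period formula (4.E.31) with $\psi(\epsilon, x)$ replaced by its leading term $\psi(0, x) = x^3/6$ from (4.E.53). That (4.E.31) carries over in this way is an assumption here, since the text leaves the extension to the reader. The result matches the classical small-amplitude pendulum period $2\pi(1 + \theta_0^2/16)$, with $\theta_0 = a\sqrt{\epsilon}$. The function name `first_order_period` is hypothetical.

```python
import math

def first_order_period(psi, a, eps, n=256):
    """Evaluate 2*pi + (eps/a) * integral_0^{2pi} cos(s) psi(a cos s) ds,
    i.e. (4.E.31) without the O(eps^2) remainder.

    The integrand is 2*pi-periodic, so the equally spaced rule used below
    (the trapezoid rule for periodic functions) is very accurate.
    """
    h = 2 * math.pi / n
    integral = h * sum(math.cos(k * h) * psi(a * math.cos(k * h)) for k in range(n))
    return 2 * math.pi + (eps / a) * integral

a, eps = 1.0, 0.04                   # sample values; theta_0 = a*sqrt(eps) = 0.2
T1 = first_order_period(lambda x: x**3 / 6, a, eps)
theta0 = a * math.sqrt(eps)
T_classical = 2 * math.pi * (1 + theta0**2 / 16)
print(T1, T_classical)
```

The agreement reflects the identity $\int_0^{2\pi}\cos^4 s\,ds = 3\pi/4$.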
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4.F. A dram of potential theory 


Newton's law of gravitation states that the force a particle of mass $m_1$, located at
$p \in \mathbb{R}^3$, exerts on a particle of mass $m_2$ located at $x \in \mathbb{R}^3$ is

(4.F.1)  $F(x) = Gm_1m_2\,\frac{p - x}{\|p - x\|^3}.$

Here $G$ is the gravitational constant, given by (4.6.64). As indicated in Exercise
6 of §4.6, the force that a planet exerts on an external body is the same as what
would be exerted if all the mass of the planet were concentrated at its center, in
the Newtonian theory. In this appendix we explain why this is true, and in the
course of doing so introduce an area of mathematical analysis known as potential
theory. We establish this identity of force fields under the hypothesis that the mass
distribution of the planet is spherically symmetric about its center. That is to say,
we assume the planet, centered at $p$, has mass density $\rho$, and

(4.F.2)  $\rho(p + Ry) = \rho(p + y), \quad \forall\, R \in O(3),\ y \in \mathbb{R}^3,$

where we recall from Chapter 2 that $O(3)$ is the set of orthogonal transformations
of $\mathbb{R}^3$. Say the planet has radius $a$, so

(4.F.3)  $\|y\| > a \Longrightarrow \rho(p + y) = 0.$

The planet's mass is

(4.F.4)  $m_1 = \int \rho(y)\,dy.$

If a particle of mass $m_2$ is located at $x \in \mathbb{R}^3$ and $\|p - x\| > a$, then the force the
planet exerts on this particle is given by

(4.F.5)  $G(x) = Gm_2\int \frac{y - x}{\|y - x\|^3}\,\rho(y)\,dy.$

We will show that if (4.F.2)-(4.F.4) hold and $\|p - x\| > a$, then $F(x) = G(x)$.


For notational simplicity, we may as well take

(4.F.6)  $p = 0,$

so

(4.F.7)  $F(x) = -Gm_1m_2\,\frac{x}{\|x\|^3}.$

Note that

(4.F.8)  $F(x) = -\nabla V(x), \quad G(x) = -\nabla W(x),$

with

(4.F.9)  $V(x) = -\frac{Gm_1m_2}{\|x\|}, \quad W(x) = -Gm_2\int_{\|y\| \le a} \frac{\rho(y)}{\|x - y\|}\,dy,$

so it suffices to prove that these potential energies coincide for $\|x\| > a$, i.e.,

(4.F.10)  $\|x\| > a \Longrightarrow V(x) = W(x).$

As a first step toward proving (4.F.10), note that clearly, for all $R \in O(3)$,

(4.F.11)  $V(Rx) = V(x),$

and furthermore

(4.F.12)  $W(Rx) = -Gm_2\int \frac{\rho(y)}{\|Rx - y\|}\,dy$
          $\qquad = -Gm_2\int \frac{\rho(Ry)}{\|Rx - Ry\|}\,dy$
          $\qquad = -Gm_2\int \frac{\rho(y)}{\|x - y\|}\,dy$
          $\qquad = W(x),$

the second identity by change of variable and the third by (4.F.2). Consequently,
we have

(4.F.13)  $V(x) = v(r), \quad W(x) = w(r), \quad r = \|x\|,$

and it remains to show that

(4.F.14)  $r > a \Longrightarrow v(r) = w(r).$
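As another step toward showing this, we note that, given $a \in (0,\infty)$, there exists
$C < \infty$ such that

(4.F.15)  $\|y\| \le a,\ \|x\| \ge a + 1 \Longrightarrow \Bigl|\frac{1}{\|x\|} - \frac{1}{\|x - y\|}\Bigr| \le \frac{C}{\|x\|^2},$

and hence, by (4.F.4), (4.F.9), and (4.F.13), there exists $C_2 < \infty$ such that

(4.F.16)  $r = \|x\| \ge a + 1 \Longrightarrow |V(x) - W(x)| \le \frac{C_2}{r^2}, \quad\text{i.e.,}\quad |v(r) - w(r)| \le \frac{C_2}{r^2}.$
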
The next step toward establishing (4.F.14) involves the following harmonicity,

(4.F.17)  $\Delta V(x) = 0, \quad \forall\, x \in \mathbb{R}^3 \setminus 0,$

where $\Delta$ is the Laplace operator,

(4.F.18)  $\Delta f(x) = \frac{\partial^2 f}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_2^2} + \frac{\partial^2 f}{\partial x_3^2}.$

To see this, recall from (1.4.11) that (on $\mathbb{R}^3$)

(4.F.19)  $f(x) = g(r) \Longrightarrow \Delta f(x) = g''(r) + \frac{2}{r}g'(r),$

and by results on Euler equations from §1.15,

(4.F.20)  $g''(r) + \frac{2}{r}g'(r) = 0 \Longleftrightarrow g(r) = \frac{c_1}{r} + c_2.$

Since

(4.F.21)  $V(x) = v(r) = -\frac{Gm_1m_2}{r},$

we have (4.F.17), and hence we also have

(4.F.22)  $\Delta_x\Bigl(\frac{1}{\|x - y\|}\Bigr) = 0 \quad\text{for } x \ne y,$

so a direct consequence of the integral formula (4.F.9) for $W(x)$ is

(4.F.23)  $\Delta W(x) = 0 \quad\text{for } \|x\| > a.$

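Hence, by (4.F.18), (4.F.19), and (4.F.20),

(4.F.24)  $r > a \Longrightarrow w''(r) + \frac{2}{r}w'(r) = 0 \Longrightarrow w(r) = \frac{c_1}{r} + c_2,$

for some constants $c_1$ and $c_2$. This identity together with (4.F.21) and (4.F.16)
proves (4.F.14): indeed, $v(r) - w(r) = -(Gm_1m_2 + c_1)/r - c_2$ for $r > a$, and the
bound (4.F.16) forces $c_2 = 0$ and $c_1 = -Gm_1m_2$. Hence we have (4.F.10), so indeed,
under the hypotheses (4.F.2)-(4.F.4) (and with $p = 0$),

(4.F.25)  $\|x\| > a \Longrightarrow F(x) = G(x).$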
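The identity (4.F.10) can be tested numerically for a sample density. The sketch below works with the normalized quantities $W/(-Gm_2)$ and $V/(-Gm_2)$, that is, it compares $\int \rho(y)/\|x - y\|\,dy$ with $m_1/\|x\|$, using spherical coordinates and the midpoint rule. The radial density, the radius, and the function name `potentials` are hypothetical choices made only for illustration.

```python
import math

def potentials(rho, a, r, n_s=200, n_mu=400):
    """Return (w, v) with
        w = integral over |y| <= a of rho(|y|) / |x - y| dy,   |x| = r > a,
        v = m1 / r,  m1 = integral of rho(|y|) dy,
    computed in spherical coordinates (s = |y|, mu = cos of the polar angle
    measured from x) by the midpoint rule; the azimuthal integral gives 2*pi.
    """
    ds = a / n_s
    dmu = 2.0 / n_mu
    w = 0.0
    mass = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds
        shell = 2.0 * math.pi * rho(s) * s * s * ds   # weight per unit mu
        mass += 2.0 * shell                           # integral of 1 over mu in [-1, 1]
        inner = 0.0
        for j in range(n_mu):
            mu = -1.0 + (j + 0.5) * dmu
            inner += dmu / math.sqrt(r * r + s * s - 2.0 * r * s * mu)
        w += shell * inner
    return w, mass / r

w, v = potentials(rho=lambda s: 1.0 + s * s, a=1.0, r=2.5)
print(w, v, abs(w - v) / v)
```

The relative difference printed last should be at the level of the quadrature error, for any choice of $r > a$.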


We mention the following refinement of (4.F.23),

(4.F.26)  $\Delta W = 4\pi Gm_2\rho.$

This is not needed to establish (4.F.14), so we will not prove it here. A proof
can be found in [45], Chapter 3, §4. Further exploration of the relation between
the Laplace operator and the potential function $W$, through (4.F.9), leads to the
subject of potential theory, addressed in Chapters 3-5 of [45] and in other books
on partial differential equations.


The Earth, the sun, and other planets and stars are approximately spherically
symmetric, but not exactly so. This leads to further corrections in calculations in
celestial mechanics. In addition, measurements of the strength of the earth's
gravitational field give information on the inhomogeneities of the earth's composition,
leading to the field of physical geodesy; cf. [20].


4.G. Brouwer’s fixed-point theorem 


Here we prove the following fixed-point theorem of L. Brouwer, which arose in §4.15.
Take

(4.G.1)  $D = \{x \in \mathbb{R}^2 : \|x\| \le 1\}.$

Theorem 4.G.1. Each smooth map $F : D \to D$ has a fixed point.


The proof proceeds by contradiction. We are claiming that $F(x) = x$ for some
$x \in D$. If not, then for each $x \in D$ define $\varphi(x)$ to be the endpoint of the ray from
$F(x)$ to $x$, continued until it hits

(4.G.2)  $\partial D = \{x \in \mathbb{R}^2 : \|x\| = 1\}.$

An explicit formula is

(4.G.3)  $\varphi(x) = x + t(x - F(x)), \quad t = \frac{\sqrt{b^2 + 4ac} - b}{2a},$
         $a = \|x - F(x)\|^2, \quad b = 2x\cdot(x - F(x)), \quad c = 1 - \|x\|^2.$

Here $t$ is picked to solve the equation $\|x + t(x - F(x))\|^2 = 1$. Note that $ac \ge 0$, so
$t \ge 0$. It is clear that $\varphi$ would have the following properties:

(4.G.4)  $\varphi : D \to \partial D$ smoothly, $\quad x \in \partial D \Longrightarrow \varphi(x) = x.$

Such a map is called a smooth retraction. The contradiction that proves Theorem
4.G.1 is provided by the following result, called Brouwer's no-retraction theorem.
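The formula (4.G.3) is easy to evaluate at any point where $F(x) \ne x$. The sketch below does this for one sample map of the disk; by Theorem 4.G.1 this map must have a fixed point (here the origin), so the formula is only applied away from it. The map `F_sample` and the function name `retraction` are hypothetical.

```python
import math

def retraction(F, x):
    """Evaluate phi(x) from (4.G.3) at a point x of the closed unit disk
    with F(x) != x: the point where the ray from F(x) through x meets the
    unit circle."""
    fx = F(x)
    w = (x[0] - fx[0], x[1] - fx[1])            # direction x - F(x)
    a = w[0] ** 2 + w[1] ** 2
    if a == 0.0:
        raise ValueError("F(x) = x: the ray from F(x) to x is undefined")
    b = 2.0 * (x[0] * w[0] + x[1] * w[1])
    c = max(1.0 - (x[0] ** 2 + x[1] ** 2), 0.0)  # clamp rounding error on the boundary
    t = (math.sqrt(b * b + 4.0 * a * c) - b) / (2.0 * a)
    return (x[0] + t * w[0], x[1] + t * w[1])

def F_sample(x):
    # Rotation by 90 degrees followed by scaling by 1/2; maps D into D.
    return (-0.5 * x[1], 0.5 * x[0])

p = retraction(F_sample, (0.3, 0.4))   # interior point, pushed out to the circle
q = retraction(F_sample, (0.6, 0.8))   # boundary point, left (nearly) fixed
print(p, math.hypot(*p))
print(q)
```

The first output has norm 1, and the second reproduces the boundary point, as in (4.G.4).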


Theorem 4.G.2. There is no smooth retraction $\varphi : D \to \partial D$ of $D$ onto its boundary.


Proof. This proof, also by contradiction, brings in material developed in §4.4.
Suppose we had such a retraction $\varphi$. Consider the closed curve

(4.G.5)  $\gamma : [0,2\pi] \to \partial D, \quad \gamma(t) = (\cos t, \sin t),$

and form

(4.G.6)  $\gamma_s(t) = \varphi(s\gamma(t)), \quad 0 \le s \le 1.$

This would be a smooth family of maps

(4.G.7)  $\gamma_s : [0,2\pi] \to \partial D, \quad \gamma_s(0) = \gamma_s(2\pi),$

such that $\gamma_1 = \gamma$ and $\gamma_0(t) = \varphi(0)$ for all $t$. The variant of Lemma 4.4.2 given in
Exercise 13 of §4.4 implies

(4.G.8)  $\int_{\gamma_s} F(y)\cdot dy$ is independent of $s \in [0,1]$,

for each $C^1$ vector field $F$ defined on a neighborhood of $\partial D$ and satisfying (4.4.4).
Clearly the line integral in (4.G.8) is 0 for $s = 0$, so we deduce that

(4.G.9)  $\int_\gamma F(y)\cdot dy = 0$

for each such vector field. In particular, this would apply to the vector field given
by (4.4.19)-(4.4.20), i.e.,

(4.G.10)  $F(x) = \frac{1}{\|x\|^2}(-x_2, x_1),$

which is smooth on $\mathbb{R}^2 \setminus 0$ and satisfies (4.4.4) (cf. (4.4.21)). On the other hand,
we compute

(4.G.11)  $\int_\gamma F(y)\cdot dy = \int_0^{2\pi} (-\sin t, \cos t)\cdot(-\sin t, \cos t)\,dt = 2\pi,$

contradicting (4.G.9) and hence contradicting the existence of such a retraction.


The fixed-point theorem is valid for all continuous $F : D \to D$. In fact, an
approximation argument, which we omit here, can be used to show that if such
continuous $F$ has no fixed point, there is a smooth approximation $\tilde F : D \to D$ that
would also have no fixed point.

Furthermore, Theorem 4.G.1 holds in $n$ dimensions, i.e., when

(4.G.12)  $D = \{x \in \mathbb{R}^n : \|x\| \le 1\}.$

The reduction to Theorem 4.G.2, in the setting of (4.G.12), is the same as above,
but the proof of Theorem 4.G.2 in the $n$-dimensional setting requires a further
argument. Proofs using topology can be found in [14] and [32]. Proofs using
differential forms can be found in [26], [49], Chapter 5, [45], Chapter 1, and [46],
Appendix G. We have no space to introduce differential forms here, but as shown in
[45], and also in [1] and [5], they give rise to many important results in the study
of differential equations, at the next level.


4.H. Geodesic equations on surfaces 


The notion of a geodesic on a surface in Euclidean space was introduced in §4.7.
Here we say a bit more about this. Let $M \subset \mathbb{R}^k$ be a smooth, $m$-dimensional
surface (classically, $k = 3$, $m = 2$), and let $u : [a,b] \to M$ be a smooth curve. The
length of this curve is

(4.H.1)  $L(u) = \int_a^b \|u'(t)\|\,dt,$

where $\|u'(t)\|^2 = u'(t)\cdot u'(t)$, the dot product taken in $\mathbb{R}^k$. We consider smooth
curves that are length minimizing, among curves with the same endpoints. Such
curves are called geodesics. They have the following property. Let $u_s$ be a smooth
family of curves satisfying

(4.H.2)  $u_s : [a,b] \to M, \quad u_s(a) = p, \quad u_s(b) = q,$

with $u_0 = u$. Then $L(u_s) \ge L(u_0)$ for all $s$, so

(4.H.3)  $\frac{d}{ds}L(u_s)\Bigr|_{s=0} = 0.$

In other words, $u_0$ is a critical point of the length functional. We define the term
"geodesics" to include all such critical paths.


The quantity $L(u)$ is unchanged under reparametrization. We will reparametrize
$u_0$ so that $\|u_0'(t)\| = c_0$ is constant. Then

(4.H.4)  $\frac{d}{ds}L(u_s)\Bigr|_{s=0} = \frac{d}{ds}\int_a^b (u_s'(t)\cdot u_s'(t))^{1/2}\,dt\,\Bigr|_{s=0}$
         $\qquad = \frac{1}{2c_0}\int_a^b \frac{\partial}{\partial s}\bigl(u_s'(t)\cdot u_s'(t)\bigr)\,dt\,\Bigr|_{s=0}.$

Equivalently,

(4.H.5)  $\frac{d}{ds}L(u_s)\Bigr|_{s=0} = \frac{1}{c_0}\,\frac{d}{ds}E(u_s)\Bigr|_{s=0},$

where

(4.H.6)  $E(u_s) = \frac{1}{2}\int_a^b \|u_s'(t)\|^2\,dt$

is the energy of the curve $u_s : [a,b] \to M$. This is exactly the energy functional
introduced in (4.7.27), and the analysis in (4.7.28)-(4.7.33) applies.

In particular, if $k = m + 1$ and $n(x)$ is the unit normal to $M$ at $x$, then the
condition that $u = u_0$ is a critical path for $E$ is given by (4.7.32), i.e.,

(4.H.7)  $u''(t) = -\Bigl(u'(t)\cdot\frac{d}{dt}n(u(t))\Bigr)\,n(u(t)).$

Now suppose that $M$ is a level set of a smooth function $f : \mathcal{O} \to \mathbb{R}$, defined on an
open set $\mathcal{O} \subset \mathbb{R}^k$, i.e., for some $c \in \mathbb{R}$,

(4.H.8)  $M = \{x \in \mathcal{O} : f(x) = c\}, \quad \nabla f \ne 0 \text{ on } M.$

Then, for $x \in M$,

(4.H.9)  $n(x) = \frac{\nabla f(x)}{\|\nabla f(x)\|},$

and the ODE (4.H.7) can be written

(4.H.10)  $u''(t) = -\frac{u'(t)\cdot D^2f(u(t))u'(t)}{\|\nabla f(u(t))\|^2}\,\nabla f(u),$

where $D^2f(x)$ is the $k \times k$ matrix of second order partial derivatives of $f$ at $x$.
Passing from (4.H.9) to (4.H.10) uses the fact that a path $u(t)$ on $M$ satisfies
$u'(t)\cdot n(u(t)) = 0$. The equation (4.H.10) can be written as a first order system:

(4.H.11)  $u' = v,$
          $v' = -\frac{v\cdot D^2f(u)v}{\|\nabla f(u)\|^2}\,\nabla f(u).$

Solutions to the system (4.H.11) define a flow

(4.H.12)  $\mathcal{F}^t : T\mathcal{O} \to T\mathcal{O}, \quad T\mathcal{O} = \mathcal{O} \times \mathbb{R}^k.$

We have

(4.H.13)  $\mathcal{F}^t : TM \to TM, \quad \mathcal{F}^t\bigr|_{TM} = \mathcal{G}^t,$

where

(4.H.14)  $TM = \{(x,v) \in \mathcal{O} \times \mathbb{R}^k : x \in M,\ v \in T_xM\},$

where $T_xM$ denotes the set of vectors $v \in \mathbb{R}^k$ tangent to $M$ at $x$, i.e.,

$v\cdot\nabla f(x) = 0,$

and

(4.H.15)  $\mathcal{G}^t : TM \to TM$

is the geodesic flow, i.e., for $v \in T_xM$,

(4.H.16)  $\mathcal{G}^t(x,v) = (\gamma(t), \gamma'(t)),$

where $\gamma(t)$ is the constant speed geodesic on $M$ satisfying

(4.H.17)  $\gamma(0) = x, \quad \gamma'(0) = v.$

However, as we illustrate below, it is often the case that

(4.H.18)  $TM$ is an unstable invariant surface for the flow $\mathcal{F}^t$.


This has consequences for the numerical treatment of geodesic curves on $M$. Indeed,
one has the important task of stabilizing a numerical approximation to the flow $\mathcal{F}^t$,
so it reliably acts on $TM$.

Numerical considerations


To be specific, suppose we apply a fourth order Runge-Kutta scheme to the
first order system (4.H.11), with step size $h$. We start at a point $(x,v) \in TM$. At
time $h$ we obtain an approximation

(4.H.19)  $(\tilde x(h), \tilde v(h))$ to $\mathcal{F}^h(x,v)$.

In fact,

(4.H.20)  $(\tilde x(h), \tilde v(h)) = \mathcal{F}^h(x,v) + O(h^5).$

In particular,

(4.H.21)  $\operatorname{dist}\bigl((\tilde x(h), \tilde v(h)), TM\bigr) = O(h^5).$

We can construct a retraction of a neighborhood of $TM$ in $T\mathbb{R}^k$ onto $TM$, and
apply this to get an approximation

(4.H.22)  $(\hat x(h), \hat v(h)) = \mathcal{G}^h(x,v) + O(h^5), \quad (\hat x(h), \hat v(h)) \in TM.$

This provides a useful modified Runge-Kutta approximation to the geodesic flow
$\mathcal{G}^t$ on $TM$. It remains fourth order accurate.
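A minimal sketch of such a modified Runge-Kutta step is given below. The text does not specify the retraction, so the one used here is an assumption made for illustration: a few Newton steps along $\nabla f$ to return the position to the level set, followed by orthogonal projection of the velocity onto the tangent space. The test surface is the unit sphere, $f(x) = x\cdot x$, $c = 1$, where the geodesic through $(1,0,0)$ with velocity $(0,1,0)$ is $(\cos t, \sin t, 0)$. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def add_scaled(u, s, w):
    """Return u + s*w componentwise."""
    return [a + s * b for a, b in zip(u, w)]

def rhs(u, v, grad_f, hess_apply):
    """Right side of (4.H.11): u' = v, v' = -(v.D^2f(u)v / |grad f(u)|^2) grad f(u)."""
    g = grad_f(u)
    coef = -dot(v, hess_apply(u, v)) / dot(g, g)
    return v, [coef * gi for gi in g]

def rk4_step(u, v, h, grad_f, hess_apply):
    k1u, k1v = rhs(u, v, grad_f, hess_apply)
    k2u, k2v = rhs(add_scaled(u, h / 2, k1u), add_scaled(v, h / 2, k1v), grad_f, hess_apply)
    k3u, k3v = rhs(add_scaled(u, h / 2, k2u), add_scaled(v, h / 2, k2v), grad_f, hess_apply)
    k4u, k4v = rhs(add_scaled(u, h, k3u), add_scaled(v, h, k3v), grad_f, hess_apply)
    u_new = [a + h * (p + 2 * q + 2 * r + s) / 6 for a, p, q, r, s in zip(u, k1u, k2u, k3u, k4u)]
    v_new = [a + h * (p + 2 * q + 2 * r + s) / 6 for a, p, q, r, s in zip(v, k1v, k2v, k3v, k4v)]
    return u_new, v_new

def retract(u, v, f, grad_f, c, newton_iters=3):
    """Assumed retraction onto TM: Newton steps along grad f restore f(u) = c,
    then v is projected orthogonally onto {v . grad f(u) = 0}."""
    for _ in range(newton_iters):
        g = grad_f(u)
        u = add_scaled(u, -(f(u) - c) / dot(g, g), g)
    g = grad_f(u)
    v = add_scaled(v, -dot(v, g) / dot(g, g), g)
    return u, v

# Unit sphere in R^3: f(x) = x.x, grad f = 2x, D^2 f = 2I.
f = lambda x: dot(x, x)
grad_f = lambda x: [2 * xi for xi in x]
hess_apply = lambda x, w: [2 * wi for wi in w]

u, v = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
h, steps = 0.01, 100
for _ in range(steps):
    u, v = rk4_step(u, v, h, grad_f, hess_apply)
    u, v = retract(u, v, f, grad_f, 1.0)
print(u, f(u) - 1.0, dot(v, grad_f(u)))
```

After each step the pair `(u, v)` lies on $TM$ up to rounding, and the final position agrees with $(\cos 1, \sin 1, 0)$ to high accuracy.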


We next examine how the instability advertised in (4.H.18) arises in case $M$ is
a $(k-1)$-dimensional ellipsoid in $\mathbb{R}^k$.


Geodesic equations on ellipsoids

Here we look at the ellipsoid $M_c \subset \mathbb{R}^k$, given by

(4.H.23)  $M_c = \{x \in \mathbb{R}^k : f(x) = c\},$

where

(4.H.24)  $f(x) = x\cdot Ax, \quad A = A^t \in M(k,\mathbb{R}),$ positive definite.

We pick $c \in (0,\infty)$. In such a case, we have

(4.H.25)  $\nabla f(x) = 2Ax, \quad D^2f(x) = 2A,$

and the geodesic equation (4.H.10) becomes (with slightly different notation)

(4.H.26)  $x''(t) = -\frac{x'(t)\cdot Ax'(t)}{Ax(t)\cdot Ax(t)}\,Ax(t).$

We write this as a first order system:

(4.H.27)  $x' = v,$
          $v' = -\varphi(x,v)Ax,$

with

(4.H.28)  $\varphi(x,v) = \frac{v\cdot Av}{\|Ax\|^2}.$

Note that, for $x \in M_c$,

(4.H.29)  $v \in T_xM_c \Longleftrightarrow v\cdot Ax = 0.$

The system (4.H.27) generates a flow $\mathcal{F}^t$ on $T\mathbb{R}^k = \mathbb{R}^{2k}$, specializing to the
geodesic flow $\mathcal{G}^t$ on $TM_c \subset \mathbb{R}^{2k}$. Under $\mathcal{G}^t$, acting on $(x,v) \in TM_c$, the quantity
$\|v\|$ is constant on each orbit. This need not be the case on other orbits of $\mathcal{F}^t$, as
we will soon see.

In fact, we have the following computations for solutions $(x,v)$ to (4.H.27):

(4.H.30)  $\frac{d}{dt}\,x\cdot Ax = 2v\cdot Ax,$

(4.H.31)  $\frac{d}{dt}\|v\|^2 = 2v\cdot v' = -2\varphi(x,v)\,v\cdot Ax,$

and

(4.H.32)  $\frac{d}{dt}\,v\cdot Ax = v\cdot Av + v'\cdot Ax$
          $\qquad = v\cdot Av - \varphi(x,v)\|Ax\|^2$
          $\qquad = 0.$

In particular, on each orbit $\gamma$ of $\mathcal{F}^t$, the quantity $v\cdot Ax$ is constant, say

(4.H.33)  $v\cdot Ax = \kappa$ on $\gamma$,

and then (4.H.30)-(4.H.31) yield

(4.H.34)  $\frac{d}{dt}\,x\cdot Ax = 2\kappa,$
          $\frac{d}{dt}\|v\|^2 = -2\kappa\,\varphi(x,v).$

We see that, on such an orbit,

(4.H.35)  $x(t)\cdot Ax(t) = c + 2\kappa t.$

If $\kappa \ne 0$, then, as $c + 2\kappa t \searrow 0$, one has $x(t) \to 0$ and, by (4.H.34), $\|v(t)\| \nearrow \infty$.
Hence the solution to (4.H.27) ceases to exist at $c + 2\kappa t = 0$, and we avoid the
absurd conclusion that $x(t)\cdot Ax(t) < 0$ for $c + 2\kappa t < 0$.
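The drift described by (4.H.33)-(4.H.35) can be observed numerically. The sketch below integrates (4.H.27) with a classical Runge-Kutta step, starting at a point of $M_c$ with a velocity that is not tangent, so $\kappa \ne 0$. The diagonal matrix `A_DIAG`, the initial data, and the function names are hypothetical sample choices.

```python
A_DIAG = (1.0, 2.0, 3.0)  # sample positive definite A = diag(1, 2, 3)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_A(x):
    return [a * xi for a, xi in zip(A_DIAG, x)]

def rhs(x, v):
    """Right side of (4.H.27), with phi(x, v) = v.Av / |Ax|^2 as in (4.H.28)."""
    Ax = apply_A(x)
    phi = dot(v, apply_A(v)) / dot(Ax, Ax)
    return v, [-phi * c for c in Ax]

def rk4_step(x, v, h):
    def shift(p, s, q):
        return [a + s * b for a, b in zip(p, q)]
    k1x, k1v = rhs(x, v)
    k2x, k2v = rhs(shift(x, h / 2, k1x), shift(v, h / 2, k1v))
    k3x, k3v = rhs(shift(x, h / 2, k2x), shift(v, h / 2, k2v))
    k4x, k4v = rhs(shift(x, h, k3x), shift(v, h, k3v))
    x_new = [a + h * (p + 2 * q + 2 * r + s) / 6 for a, p, q, r, s in zip(x, k1x, k2x, k3x, k4x)]
    v_new = [a + h * (p + 2 * q + 2 * r + s) / 6 for a, p, q, r, s in zip(v, k1v, k2v, k3v, k4v)]
    return x_new, v_new

x, v = [1.0, 0.0, 0.0], [0.2, 1.0, 0.0]   # x.Ax = 1, but v.Ax = 0.2 is not zero
c = dot(x, apply_A(x))
kappa = dot(v, apply_A(x))
speed_sq_start = dot(v, v)

h, steps = 1e-3, 500
for _ in range(steps):
    x, v = rk4_step(x, v, h)
t = h * steps

print(dot(x, apply_A(x)), c + 2 * kappa * t)   # compare with (4.H.35)
print(dot(v, apply_A(x)), kappa)               # compare with (4.H.33)
print(dot(v, v), speed_sq_start)               # |v|^2 is not conserved off TM_c
```

With $\kappa > 0$ the orbit leaves $M_c$ at a linear rate in $x\cdot Ax$, and $\|v\|^2$ decreases, as (4.H.34) predicts.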


4.I. Rigid body motion in $\mathbb{R}^n$ and geodesics on SO(n)


Suppose there is a rigid body in $\mathbb{R}^n$, with a mass distribution at $t = 0$ given by
a function $\rho(x)$, which we will assume is piecewise continuous and has compact
support. We also assume $\rho \ge 0$ and it is not identically zero. Suppose the body
moves, subject to no external forces, only the constraint of being rigid. We want
to describe the motion of such a body. According to the Lagrangian approach to
mechanics, we seek a critical path of the integrated kinetic energy, subject to this
constraint.


If $\xi(t,x)$ is the position at time $t$ of the point on the body whose position at
time 0 is $x$, then we can write the Lagrangian as

(4.I.1)  $I(\xi) = \frac{1}{2}\int_{t_0}^{t_1}\int_{\mathbb{R}^n} \rho(x)\,|\dot\xi(t,x)|^2\,dx\,dt.$

Here $\dot\xi(t,x) = \partial\xi/\partial t$.

Using center of mass coordinates, we will assume that the center of mass of the
body is at the origin, and its total linear momentum is zero, so

(4.I.2)  $\xi(t,x) = W(t)x, \quad W(t) \in SO(n),$

where $SO(n)$ is the set of rotations of $\mathbb{R}^n$, introduced in §2.12. Thus, describing the
motion of the body becomes the problem of specifying the curve $W(t)$ in $SO(n)$.
We can write (4.I.1) as

(4.I.3)  $I(\xi) = J(W) = \frac{1}{2}\int_{t_0}^{t_1}\int_{\mathbb{R}^n} \rho(x)\,\|W'(t)x\|^2\,dx\,dt.$

We look for an extremum, or other critical point, where we vary the family of paths
$W : [t_0,t_1] \to SO(n)$ (keeping the endpoints fixed).
We want to reduce the formula (4.I.3) for $J(W)$ to a single integral, over $t$. To
do this, we bring in the following.


Lemma 4.I.1. If $A, B \in M(n,\mathbb{R})$, then

(4.I.4)  $\int_{\mathbb{R}^n} \rho(x)\,(Ax, Bx)\,dx = \operatorname{Tr}(A\mathcal{I}_\rho B^t),$

where $\mathcal{I}_\rho \in M(n,\mathbb{R})$ is defined by

(4.I.5)  $\mathcal{I}_\rho = \int_{\mathbb{R}^n} \rho(x)\,xx^t\,dx.$


Proof. It suffices to note that

(4.I.6)  $(Ax, Bx) = \operatorname{Tr}(Axx^tB^t),$

as a consequence of the identity $(x,y) = \operatorname{Tr} xy^t$, for $x, y \in \mathbb{R}^n$, regarded as column
vectors.


Note that $\mathcal{I}_\rho$ is a symmetric, positive-definite $n \times n$ matrix. Now, using (4.I.4),
we can write the Lagrangian (4.I.3) as

(4.I.7)  $J(W) = \frac{1}{2}\int_{t_0}^{t_1} \operatorname{Tr}(W'(t)\mathcal{I}_\rho W'(t)^t)\,dt$
         $\qquad = \frac{1}{2}\int_{t_0}^{t_1} Q_\rho(W'(t), W'(t))\,dt,$

where $Q_\rho$ is the inner product on $M(n,\mathbb{R})$ defined by

(4.I.8)  $Q_\rho(A,B) = \operatorname{Tr}(A\mathcal{I}_\rho B^t).$

Note that this inner product is invariant under left multiplication by elements of
$SO(n)$, i.e.,

(4.I.9)  $W \in SO(n) \Longrightarrow Q_\rho(WA, WB) = \operatorname{Tr}(WA\mathcal{I}_\rho B^tW^{-1}) = Q_\rho(A,B).$

On the other hand, for $W \in SO(n)$,

(4.I.10)  $Q_\rho(AW, BW) = \operatorname{Tr}(AW\mathcal{I}_\rho W^{-1}B^t),$

which is equal to $Q_\rho(A,B)$ for all $A, B \in M(n,\mathbb{R})$ if and only if $W\mathcal{I}_\rho = \mathcal{I}_\rho W$. In
turn, this holds for all $W \in SO(n)$ if and only if $\mathcal{I}_\rho$ is a scalar multiple of the
identity matrix $I$.


Finding a critical path $W : I \to SO(n)$ for (4.I.7) is a constrained variational
problem, similar to those described in (4.7.21)-(4.7.27). Parallel to (4.7.28), the
condition for a path $W$ to be critical is

(4.I.11)  $W''(t) \perp T_{W(t)}SO(n), \quad \forall\, t \in I,$

orthogonality being with respect to the inner product $Q_\rho$, i.e.,

(4.I.12)  $A \in T_{W(t)}SO(n) \Longrightarrow Q_\rho(W''(t), A) = 0.$

Given $V \in SO(n)$, we can define the vector space $T_VSO(n)$ as the space of all
matrices $W'(0)$, for smooth curves $W : (-\epsilon,\epsilon) \to SO(n)$ satisfying $W(0) = V$. For
example,

(4.I.13)  $T_ISO(n) = \operatorname{Skew}(n) = \{X \in M(n,\mathbb{R}) : X^t = -X\},$

and, for $V \in SO(n)$,

(4.I.14)  $T_VSO(n) = \{VX : X \in \operatorname{Skew}(n)\} = \{YV : Y \in \operatorname{Skew}(n)\}.$

Comparison with (4.H.1)-(4.H.6) shows that these critical paths are geodesics on
$SO(n)$, where the length of a curve $W : [t_0,t_1] \to SO(n)$ is given by

(4.I.15)  $L_\rho(W) = \int_{t_0}^{t_1} Q_\rho(W'(t), W'(t))^{1/2}\,dt.$

To proceed, we see from (4.I.12)-(4.I.14) that the condition for $W : I \to SO(n)$
to be a critical path for (4.I.7) is

(4.I.16)  $\operatorname{Tr}(W(t)^{-1}W''(t)\mathcal{I}_\rho X) = 0, \quad \forall\, X \in \operatorname{Skew}(n),$

upon setting $A = W(t)X$ in (4.I.12). It is convenient to bring in

(4.I.17)  $Z(t) = W(t)^{-1}W'(t),$

and derive an equation for $Z(t)$ from (4.I.16). First, note the following (which
echoes part of (4.I.14)).


Lemma 4.I.2. If $W : I \to SO(n)$ is a smooth curve, then

(4.I.18)  $Z(t) \in \operatorname{Skew}(n), \quad \forall\, t \in I.$


Proof. Differentiating $W(t)^tW(t) = I$ gives

$W'(t)^tW(t) = -W(t)^tW'(t),$

hence, since $W(t)^t = W(t)^{-1}$,

$Z(t)^t = W'(t)^tW(t) = -W(t)^{-1}W'(t) = -Z(t),$

which gives (4.I.18).

To recast (4.I.16) in terms of $Z(t)$, note that (4.I.17) yields

(4.I.19)  $Z'(t) = W(t)^{-1}W''(t) - W(t)^{-1}W'(t)W(t)^{-1}W'(t)$
          $\qquad = W(t)^{-1}W''(t) - Z(t)^2.$

Now, given $B \in M(n,\mathbb{R})$,

(4.I.20)  $\operatorname{Tr}(BX) = 0\ \ \forall\, X \in \operatorname{Skew}(n) \Longleftrightarrow B = B^t.$

Hence the condition (4.I.16) is equivalent to the statement that

(4.I.21)  $[Z'(t) + Z(t)^2]\,\mathcal{I}_\rho$ is symmetric.


If we denote the matrix in (4.I.21) by $B$ and compute $B - B^t$, we arrive at the
following result.


Proposition 4.I.3. If we define $Z(t)$ by (4.I.17), the condition that $W : I \to SO(n)$
be a critical path for (4.I.7) is equivalent to

(4.I.22)  $Z'(t)\mathcal{I}_\rho + \mathcal{I}_\rho Z'(t) + Z(t)^2\mathcal{I}_\rho - \mathcal{I}_\rho Z(t)^2 = 0.$

To work on (4.I.22), let us define

(4.I.23)  $\mathcal{L}_\rho : \operatorname{Skew}(n) \to \operatorname{Skew}(n), \quad \mathcal{L}_\rho X = \frac{1}{2}(X\mathcal{I}_\rho + \mathcal{I}_\rho X).$

Then (4.I.22) can be written

(4.I.24)  $2\mathcal{L}_\rho Z'(t) - [\mathcal{I}_\rho, Z(t)^2] = 0,$

where, generally, $[A,B] = AB - BA$. In turn, if we set

(4.I.25)  $M(t) = \mathcal{L}_\rho Z(t) = \frac{1}{2}(Z(t)\mathcal{I}_\rho + \mathcal{I}_\rho Z(t)),$

and note that

(4.I.26)  $[\mathcal{I}_\rho, Z^2] = 2[M, Z],$

we can recast (4.I.24) as

(4.I.27)  $M'(t) = [M(t), Z(t)],$

or equivalently

(4.I.28)  $M'(t) = [M(t), \mathcal{L}_\rho^{-1}M(t)],$

a system of ODE with a quadratic nonlinearity. The following result leads to
valuable information about $M(t)$.


Proposition 4.I.4. Suppose (4.I.27) holds for $t \in I$, that $t_0 \in I$, and $M_0 = M(t_0)$.
Then there exists $U : I \to SO(n)$ such that

(4.I.29)  $M(t) = U(t)M_0U(t)^{-1}, \quad t \in I.$


Proof. We produce a linear ODE for $U(t)$. Differentiating (4.I.29) gives

(4.I.30)  $M'(t) = U'(t)M_0U(t)^{-1} - U(t)M_0U(t)^{-1}U'(t)U(t)^{-1}$
          $\qquad = U'(t)U(t)^{-1}M(t) - M(t)U'(t)U(t)^{-1}$
          $\qquad = [M(t), Z(t)],$

provided $Z = -U'U^{-1}$, i.e.,

(4.I.31)  $U'(t) = -Z(t)U(t).$

To obtain (4.I.29), take $U$ to solve (4.I.31), with $U(t_0) = I$, and verify that $Z(t) \in
\operatorname{Skew}(n) \Rightarrow U(t) \in SO(n)$.


Note that (4.I.29) implies

(4.I.32)  $\|M(t)\| = \|M_0\|, \quad \forall\, t \in I.$

Hence, by Proposition 4.1.2, we have the following.


Proposition 4.I.5. Given $t_0 \in \mathbb{R}$ and initial data $M(t_0) = M_0 \in \operatorname{Skew}(n)$, the
system (4.I.28) has a unique solution for all $t \in \mathbb{R}$, $M : \mathbb{R} \to \operatorname{Skew}(n)$.


Having a solution to (4.I.28), we can retrace our steps, obtaining $Z(t) =
\mathcal{L}_\rho^{-1}M(t)$, satisfying (4.I.22), and then solve the linear system

(4.I.33)  $W'(t) = W(t)Z(t), \quad W(t_0) = W_0 \in SO(n),$

to obtain a critical path for (4.I.7).


The identity (4.I.32) says the operator norm $\|M(t)\|$ is a conserved quantity
for solutions to (4.I.28). We record some other conserved quantities.


Proposition 4.I.6. For each solution $M : \mathbb{R} \to \operatorname{Skew}(n)$ to (4.I.28) and each
$k \in \mathbb{N}$, the quantities

(4.I.34)  $\operatorname{Tr} M(t)^{2k}$

are independent of $t$. So is

(4.I.35)  $Q_\rho(Z(t), Z(t)),$

with $Z(t) = \mathcal{L}_\rho^{-1}M(t)$.


Proof. From (4.I.29) we have

(4.I.36)  $M(t)^{2k} = U(t)M_0^{2k}U(t)^{-1},$

and taking traces yields (4.I.34). To get (4.I.35), note that, when $W : \mathbb{R} \to SO(n)$
is a critical path for (4.I.7), then

(4.I.37)  $\frac{d}{dt}Q_\rho(W'(t), W'(t)) = 2Q_\rho(W''(t), W'(t)) = 0,$

the last identity by (4.I.11)-(4.I.12). Since $Z(t) = W(t)^{-1}W'(t)$, (4.I.9) gives

(4.I.38)  $Q_\rho(Z(t), Z(t)) = Q_\rho(W'(t), W'(t)),$

and we have (4.I.35).


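Note. The conserved quantities listed in Proposition 4.I.6 include two quadratic
forms in $Z$, namely

(4.I.39)  $Q_\rho(Z,Z) = -\operatorname{Tr} Z\mathcal{I}_\rho Z,$
          $Q_M(Z,Z) = \frac{1}{4}\operatorname{Tr}(Z\mathcal{I}_\rho + \mathcal{I}_\rho Z)^2.$

Let us specialize to $n = 3$. Assume the standard basis $\{e_1, e_2, e_3\}$ of $\mathbb{R}^3$ diagonalizes
$\mathcal{I}_\rho$, and set

(4.I.40)  $Z = \kappa(x) = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix}, \quad \mathcal{I}_\rho = \begin{pmatrix} a_1 & & \\ & a_2 & \\ & & a_3 \end{pmatrix}.$

Here the isomorphism $\kappa : \mathbb{R}^3 \to \operatorname{Skew}(3)$ is chosen to satisfy

(4.I.41)  $x \times y = \kappa(x)y, \quad x, y \in \mathbb{R}^3,$

where $x \times y$ is the cross product on $\mathbb{R}^3$. Compare §2.12, Exercise 9. A calculation
gives

(4.I.42)  $-\operatorname{Tr} Z\mathcal{I}_\rho Z = (a_2 + a_3)x_1^2 + (a_1 + a_3)x_2^2 + (a_1 + a_2)x_3^2 = x\cdot J_\rho x,$

where

(4.I.43)  $J_\rho = (\operatorname{Tr}\mathcal{I}_\rho)I - \mathcal{I}_\rho = \begin{pmatrix} a_2 + a_3 & & \\ & a_1 + a_3 & \\ & & a_1 + a_2 \end{pmatrix} = \begin{pmatrix} \alpha_1 & & \\ & \alpha_2 & \\ & & \alpha_3 \end{pmatrix}.$
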
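Next, we have

(4.I.44)  $M = \frac{1}{2}(Z\mathcal{I}_\rho + \mathcal{I}_\rho Z) = \frac{1}{2}\begin{pmatrix} 0 & -\alpha_3x_3 & \alpha_2x_2 \\ \alpha_3x_3 & 0 & -\alpha_1x_1 \\ -\alpha_2x_2 & \alpha_1x_1 & 0 \end{pmatrix},$

hence

(4.I.45)  $-\operatorname{Tr} M^2 = \|M\|_{HS}^2 = \frac{1}{2}(\alpha_1^2x_1^2 + \alpha_2^2x_2^2 + \alpha_3^2x_3^2) = \frac{1}{2}J_\rho x\cdot J_\rho x.$

Note also that

(4.I.46)  $M = \frac{1}{2}\kappa(J_\rho x).$

We want to rewrite the equations (4.I.27)-(4.I.28) as equations for

(4.I.47)  $x(t) = \kappa^{-1}Z(t), \quad y(t) = 2\kappa^{-1}M(t) = J_\rho x(t),$

using the cross product. Complementing (4.I.41), we have

(4.I.48)  $\kappa(x \times y) = [\kappa(x), \kappa(y)];$

see again §2.12, Exercise 9. Given this, we read off from (4.I.27) that, if $x$ and $y$
are given by (4.I.47), then

(4.I.49)  $\frac{dy}{dt} = -x \times y.$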



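In this setting, $x(t)$ is called the angular velocity of the rotating body, $y(t)$ its angular
momentum, and $J_\rho$ the inertia tensor. The equation (4.I.49) is the standard form
of Euler's equation for the free motion of a rigid body in $\mathbb{R}^3$.


We can rederive the conservation laws for $x\cdot J_\rho x$ and $J_\rho x\cdot J_\rho x$ directly from
(4.I.49), upon noting that $x \times y$ is orthogonal to $x$ and to $y = J_\rho x$, hence

(4.I.50)  $0 = x\cdot J_\rho x' = \frac{1}{2}\frac{d}{dt}\,x\cdot J_\rho x,$
          $0 = J_\rho x\cdot J_\rho x' = \frac{1}{2}\frac{d}{dt}\,J_\rho x\cdot J_\rho x.$

Explicitly, the conservation laws we get are

(4.I.51)  $\alpha_1x_1^2 + \alpha_2x_2^2 + \alpha_3x_3^2 = C_1,$
          $\alpha_1^2x_1^2 + \alpha_2^2x_2^2 + \alpha_3^2x_3^2 = C_2,$

since we have chosen coordinates on $\mathbb{R}^3$ so that $J_\rho$ is given by (4.I.43), and $\mathcal{I}_\rho$ by
(4.I.40). Note also that

(4.I.52)  $a_1 > a_2 > a_3 > 0 \Longrightarrow 0 < \alpha_1 < \alpha_2 < \alpha_3,$

and more generally $a_1 \ge a_2 \ge a_3 > 0 \Rightarrow 0 < \alpha_1 \le \alpha_2 \le \alpha_3$.
Given that $y = J_\rho x$, we have

(4.I.53)  $x \times y = \begin{pmatrix} (\alpha_3 - \alpha_2)x_2x_3 \\ (\alpha_1 - \alpha_3)x_1x_3 \\ (\alpha_2 - \alpha_1)x_1x_2 \end{pmatrix},$

and (4.I.49) becomes

(4.I.54)  $\alpha_1x_1' + (\alpha_3 - \alpha_2)x_2x_3 = 0,$
          $\alpha_2x_2' + (\alpha_1 - \alpha_3)x_1x_3 = 0,$
          $\alpha_3x_3' + (\alpha_2 - \alpha_1)x_1x_2 = 0.$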


If any of the quantities $\alpha_j$ coincide, the system (4.I.54) simplifies. For example, if
$\alpha_1 = \alpha_2$, we get $x_3' = 0$, hence $x_3 = \xi_3 = $ const, and

(4.I.55)  $x_1' = -\gamma\xi_3x_2,$
          $x_2' = \gamma\xi_3x_1$

(with $\gamma = (\alpha_3 - \alpha_1)/\alpha_1$), a constant coefficient linear system. If $\alpha_1, \alpha_2, \alpha_3$ are all
distinct, as in (4.I.52), then we can deduce from (4.I.51) identities of the form

(4.I.56)  $x_1^2 + \gamma_3x_2^2 = c_3,$
          $x_1^2 + \gamma_2x_3^2 = c_2,$
          $x_2^2 + \gamma_1x_3^2 = c_1,$

and then (4.I.54) transforms into equations such as

(4.I.57)  $x_1' = A_1(c_3 - x_1^2)^{1/2}(c_2 - x_1^2)^{1/2},$

etc. The equation (4.I.57) and its analogues for $x_2$ and $x_3$ are separable. One gets

(4.I.58)  $\int \frac{dx_1}{\sqrt{(c_3 - x_1^2)(c_2 - x_1^2)}} = A_1\int dt,$

the left side being an elliptic integral, such as arose in the treatment of the pendulum
in §1.6. See particularly Exercise 4 at the end of §1.6.


Figure 4.I.1. Orbits of $y' = y \times x$ on $|y|^2 = 1$


An alternative presentation of (4.I.51) is

(4.I.59)  $y_1^2 + y_2^2 + y_3^2 = C_2,$
          $\beta_1y_1^2 + \beta_2y_2^2 + \beta_3y_3^2 = C_1,$

with $y_j = \alpha_jx_j$, $\beta_j = 1/\alpha_j$. One obtains variants of (4.I.54)-(4.I.58), with $y_j$ in
place of $x_j$. Note that

(4.I.60)  $0 < \alpha_1 < \alpha_2 < \alpha_3 \Longrightarrow 0 < \beta_3 < \beta_2 < \beta_1.$

One variant of (4.I.56), following from (4.I.59), is

(4.I.61)  $(\beta_1 - \beta_2)y_1^2 - (\beta_2 - \beta_3)y_3^2 = C_1 - \beta_2C_2.$


Orbits $y(t)$, solving (4.I.49) with $x = J_\rho^{-1}y$, lie on curves in the intersection of a
sphere $|y|^2 = C_2$ with a surface given by (4.I.61). (See Figure 4.I.1 for an illustration.
In this illustration, the observer is looking down the $y_2$-axis, and $\beta_1 - \beta_2 = \beta_2 - \beta_3$.)


Let us write the system (4.I.49) as

(4.I.62)  $\frac{dy}{dt} = F(y), \quad F(y) = y \times J_\rho^{-1}y.$

Then $F$ is a vector field on $\mathbb{R}^3$ that is tangent to each sphere $S_C = \{|y|^2 = C\}$, and
we can regard the solution to (4.I.62) as defining a flow on each such sphere, and
$F|_{S_C}$ as a vector field on $S_C$. It has six critical points. In case $C = 1$, the critical
points are

(4.I.63)  $e_1, -e_1, e_3, -e_3$, centers,
          $e_2, -e_2$, saddles.

The result (4.I.63) has the following significance. Suppose $B \subset \mathbb{R}^3$ is a rigid
body with inertia tensor $J_\rho$ given by (4.I.43), whose diagonal entries satisfy the
hypotheses in (4.I.60). Suppose $B$ is set rotating, with angular momentum $y_0 = y(0)$
at time $t = 0$. If $y_0$ is parallel to any of the six vectors in (4.I.63), then $B$ will rotate
steadily, with constant angular momentum $y_0$ (hence constant angular velocity
$x_0 = J_\rho^{-1}y_0$). Furthermore, if $y_0/|y_0|$ is close to one of the four centers $\pm e_1, \pm e_3$,
then $y(t)/|y(t)|$ remains close to such a center for all $t$. On the other hand, suppose
$y_0/|y_0|$ is close to but not exactly equal to $\pm e_2$. Then $y(t)/|y(t)|$ travels along a path
taking it close to $-y_0/|y_0|$, then back to $y_0/|y_0|$, infinitely often. Thus rotation of
$B$ about the $e_1$ and $e_3$-axes is stable, but rotation about the $e_2$-axis is unstable. (In
this connection, note from (4.I.40) that $\kappa(e_j) \in \operatorname{Skew}(3)$ generates rotation about
the $e_j$-axis.)
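The stability statements above can be illustrated numerically. The following sketch integrates (4.I.62) with a classical Runge-Kutta step for one sample inertia tensor, starting near $e_1$ and near $e_2$. The values in `ALPHA`, the size of the perturbation, the time span, and the function names are hypothetical choices.

```python
ALPHA = (1.0, 2.0, 3.0)  # sample diagonal entries of J_rho, as in (4.I.60)

def field(y):
    """F(y) = y x J_rho^{-1} y, the right side of (4.I.62)."""
    x = [yj / aj for yj, aj in zip(y, ALPHA)]      # angular velocity x = J_rho^{-1} y
    return [y[1] * x[2] - y[2] * x[1],
            y[2] * x[0] - y[0] * x[2],
            y[0] * x[1] - y[1] * x[0]]

def orbit(y0, t_end, h=0.01):
    """Classical RK4 samples of the solution of dy/dt = F(y), y(0) = y0."""
    y = list(y0)
    out = [y]
    for _ in range(int(round(t_end / h))):
        k1 = field(y)
        k2 = field([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = field([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = field([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (p + 2 * q + 2 * r + s) / 6 for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        out.append(y)
    return out

near_e1 = orbit((1.0, 1e-3, 0.0), 200.0)   # small perturbation of e_1 (a center)
near_e2 = orbit((1e-3, 1.0, 0.0), 200.0)   # small perturbation of e_2 (a saddle)

min_y1 = min(y[0] for y in near_e1)        # stays close to 1
min_y2 = min(y[1] for y in near_e2)        # comes close to -1
norm0 = sum(c * c for c in near_e2[0])
drift = max(abs(sum(c * c for c in y) - norm0) for y in near_e2)  # |y|^2 is conserved
print(min_y1, min_y2, drift)
```

The orbit started near $e_1$ never strays far from it, while the orbit started near $e_2$ travels to a neighborhood of $-e_2$, in agreement with (4.I.63). The last number printed measures how well the scheme preserves the first conserved quantity in (4.I.59).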
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